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ANALYSIS  AND  DEVELOPMENT  OF  IMAGE  STATISTICS 
AND  REDUNDANCY  REMOVAL 

CHAPTER  I 

INTRODUCTION  AND  BACKGROUND 

Extracting  features  from  aerial  photographs  has  been  an 
important  cartographic  effort  for  many  decades.  Much  tech¬ 
nology  has  been  introduced  to  improve  and  refine  this  process, 
but  the  crucial  and  limiting  factor  has  been  the  need  for 
trained  human  operators  with  the  intelligence,  experience  and 
skills  to  recognize  and  identify  the  many  diverse  (and  some¬ 
times  unexpected)  cartographically  interesting  objects  that 
appear  in  a  typical  aerial  photograph. 

It  would  appear  that  many  of  the  functions  performed  by  a 
trained  cartographic  observer  could  and  should  be  automated. 

At  the  very  least,  a  computer  aided  human  system  would  increase 
productivity  and  decrease  the  inevitable  fatigue  and  thereby 
improve  both  the  detection  and  false  alarm  rate  of  routine  pic¬ 
ture  examination. 

The  process  of  extracting  important  features  from  a 
photograph  at  first  glance,  appears  to  be  characterized  with¬ 
in  the  broad  framework  of  the  theory  of  pattern  recognition. 
Work  on  pattern  recognition  has  been  ongoing  for  about  three 
decades  with  some  notable  successes.  These  successes  have  been 
evident  in  printed  character  recognition,  biomedical  applica¬ 
tions  such  as  cell  anomaly  identification,  and  to  an 
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extent,  in  remote  sensing.  The  application  of  pattern  recogni¬ 
tion  theories  to  cartographic  feature  extraction  is,  however, 
very  sparse  indeed. 

Much  of  the  work  on  pattern  recognition  has  been  based 
on  either 

(1)  statistical  decision  theoretic  methods 

(2)  template  matching  (matched  filters) 

(3)  syntactical  decriptions  of  geometric  objects 

The  statistical  approach  is  based  on  knowledge  of  a 
priori  probability  characterization  of  the  objects  to  be  examin¬ 
ed.  This  implies  good  statistical  behavior  of  the 
ensemble  of  pictures  to  be  examined  and  the  objects  to  be 
characterized. 

Template  matching  works  quite  well  if  again,  there  is  a 
priori  information  about  precise  shape,  size  and  orientation  of 
objects  to  be  identified. 

Syntactical  grammars  constitute  a  relatively  new,  but 
more  abstract  way  of  dealing  with  the  problem  by  translating  the 
image  into  strings  of  predefined  primitives  and  then  performing 
the  recognition  operation  on  the  picture  description  language  so 
developed.  Various  approaches  to  the  translation  process  has 
been  proposed  including  linear  strings,  tree  representations, 
webs  and  other  data  structures.  These  techniques  have  had  some 
measure  of  success  in  pictures  which  can  be  skeletonized  to  line 
drawings  or  other  forms  of  a  similar  nature.  Notable  are  the 
successes  in  chromosome  analysis. 
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It  should  be  emphasized  that  all  image  analyses  usually 
require  a  considerable  amount  of  preprocessing  such  as  linear 
transformations  and  enhancement  techniques.  Indeed,  the  vast 
majority  of  research  on  images  has  been  done  in  this  area  of 
digital  image  processing  which  by  itself  cannot  and  does  not 
yield  either  feature  extraction  or  recognition  of  objects. 

For  aerial  cartographic  analysis,  which  is  the  subject 
of  this  report,  it  is  unfortunate  that  at  this  point  in  its 
development,  very  little  of  the  classical  pattern  recognition 
theories  are  immediately  applicable.  If  we  are  to  be  success¬ 
ful  with  developing  algorithms  which  actually  work  on  real 
images,  a  good  deal  of  ad  hoc  methods  must  be  used.  Hie  reason 
for  this  is  that  it  is  rare  that  cartographically  interesting 
objects  can  be  assigned  a  priori  statistical  distributions  such 
as  is  needed  in  a  decision  theoretic  approach.  Templates  are 
not  likely  to  be  successful  because  even  approximate  shape  for 
the  same  kind  of  object  is  too  variable,  let  alone  orientation 
variability.  We  are  thus  left  to  examining  the  basic  nature 
of  the  objects  of  interest.  In  aerial  catography,  the  basic 
nature  can  be  dichotomized: 


(1)  Natural  and  gross  man-made  objects  such  as  forests, 
fields,  water,  city  streets,  etc.  These  objects 
are  properly  characterized  by  the  texture,  reflec¬ 
tivity  and  fine  structure. 

(2)  Discrete  man-made  objects  such  as  bridges,  roads, 
canals,  airports,  storage  containers,  specific  in¬ 
dustrial  sites.  These  man-made  objects  have  the 
distinguishing  feature  of  having  unique  geometries, 
shape  boundaries  and  fairly  sharp  contrast  with 
their  environment. 


The  natural  and  gross  man-made  objects  probably  will  yield 
some  degree  of  computerized  recognition  by  taking  advantage  of 
the  texture  and  fine  structure.  The  techniques  for  doing  so 
are  available  and  indeed.  Dr.  P.  F.  Chen  and  his  colleagues  at 
the  Engineer  Topographic  Laboratories  have  demonstrated  a  pre¬ 
liminary  algorithm  which  discriminates  forest,  field,  urban 
area,  water  or  none  of  the  above  using  sample  statistics  of 
sections  of  images.  The  sample  statistics  include  averages, 
correlation  and  absolute  value,  all  of  which  are  very  easy  to 
compute  and  the  results  are  compared  against  an  empirically 
determined  threshold.  Geometry  is  not  examined  and  the  algo¬ 
rithm  works  very  well  when  only  one  of  the  four  objects  are  in 
the  scanning  scene.  It  remains  to  be  determined  how  such 
algorithms,  which  depend  on  averages  of  the  gross  image,  work 
when  there  is  an  overlap  with  other  boundary  objects. 

In  this  report  we  are  concerned  principally  with  the 
second  of  the  categories  of  cartographically  important  objects, 
namely  the  discrete  man-made  ones.  This  class  of  objects  lend 
themselves  to  enhancement;  and  the  recognition  of  them  is 
enormously  benefited  by  various  types  of  digital  image  prepro¬ 
cessing  such  as  linear  transforms,  edge  detectors,  local  con¬ 
trast  changes  and  the  use  of  a  priori  knowledge  of  the  geometri¬ 
cal  characteristics  of  objects  in  question.  The  main  emphasis 
in  our  work  was  to  develop  a  set  of  working  tools  which  could 
thus  be  incorporated  into  a  transportable  algorithm  that  would 
work  on  actual  images.  To  illustrate  the  principles  and  to 
develop  a  concrete  example,  we  have  placed  most  of  our  efforts 
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on  bridge-over  water  recognition.  The  objective  was  to  have 
the  algorithm  be  effective  regardless  of  the  other  objects  in 
the  field,  even  those  that  superficially  might  resemble  a 
bridge.  In  order  to  do  this  we  developed  and  refined  a  set 
of  digital  preprocessing  techniques  such  as  edge  detectors, 
image  smoothers,  straight-line  transforms  (quantized  Hough  and 
other  variants) ,  thresholding  techniques  and  variations  on  the 
medial  axis  transform.  In  addition,  we  built  in  the  a  priori 
knowledge  of  bridge  characteristics  including  the  global  en¬ 
vironment.  The  actual  process  of  bridge  identification  then 
consists  of  applying  the  preprocessing  operations  in  sequence 
and  completed  with  a  decision  if  all  tests  are  met.  In  a  sense, 
this  procedure  can  be  viewed  as  a  syntactical  approach  when 
the  basic  language  primitives  are  the  individual  operations 
(edge  detection.  Hough  transform,  thresholding,  etc.). 

While  the  main  thrust  of  the  specific  end  product  of  this 
work  has  been  in  bridge  detection,  we  have  not  been  unaware  of 
applicability  of  our  methods  to  other  objects  such  as  airport 
runways,  roads  and  industrial  sites.  We  have  made  some  inroads 
to  the  recognition  of  these  objects  as  well.  In  addition,  we 
have  examined  certain  aspects  of  the  coding  of  line  drawings 
since  it  was  felt  that  many  of  the  man-made  objects  can  be 
skeletonized  to  line  drawings  and  that  the  coding  process  can 
be  helpful  in  their  identification. 

In  the  chapters  that  follow,  we  present  a  condensation  of 
the  effort  of  the  past  year.  The  report  is  organized  in  a  way 
to  allow  the  reader  to  examine  separately  the  models  used,  the 
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individual  routines  and  algorithms,  and  their  synthesis  in 
an  actual  application.  Standard  FORTRAN  code  is  included 
in  the  Appendix  and  a  tape  has  already  been  provided. 


( 


CHAPTER  II 


APPROACHES  TO  AUTOMATIC  CARTOGRAPHIC 
IMAGE  CLASSIFICATION 

Background 

Classification  of  a  cartographic  image  is  used  here  to 
mean  the  assignment  of  a  class  label  to  the  image  or  a  subimage. 
The  label  would  denote  a  standard  cartographic  feature — a  bridge, 
road,  river,  etc. — and  in  general  would  identify  a  set  disjoint 
from  those  corresponding  to  other  labels.  The  image  (or  sub¬ 
image — but  henceforth  all  will  be  called  images)  is  thus  a 
pattern  and  a  number  of  pattern  recognition  techniques  are  poten¬ 
tially  applicable.  The  simplest  approach  is  template  matching, 
which  compares  templates  of  prototypical  cartographic  objects 
to  the  image  being  examined.  The  template  that  is  closest  (as 
determined  with  a  suitable  metric)  is  declared  the  class  of  the 
image.  Template  matching  is  computationally  easy  and  fast;  it 
is,  however,  strongly  dependent  on  the  orientation  and  scale 
of  the  image;  the  quality  of  the  image  can  also  affect  the 
match. 

That  sensitivity  gave  rise  to  the  use  of  features  that 
jointly  characterize  the  pattern  and  that  exhibit  less  varia¬ 
tion  as  the  image  departs  from  the  prototype  or  is  corrupted 
by  noise.  Features  are  measurements  made  on  the  pattern  and 
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are  chosen  to  be  both  easy  to  extract  and  effective  in  separat¬ 
ing  the  classes.  A  set  of  n  features  defines  an  n-dimensional 
feature  space  within  which  we  want  the  patterns  to  be  well 
separated. 

An  image  in  analog  form  (e.g.,  a  continuous- tone  photo¬ 
graph)  must  undergo  several  preprocessing  operations  before 
it  is  in  a  form  suitable  for  feature  extraction.  An  overall 
block  diagram  illustrating  the  major  processes  that  are  usually 
implemented  in  a  modern  digital  image  processing  system  is 
shown  in  Figure  2.1,  and  a  general  description  of  each  of  these 
processes  follows.  The  first  process  usually  applied  to  an  in¬ 
put  picture  is  that  of  digitization,  in  which  this  original 
picture  is  converted  into  another  two-dimensional  representation 
that  is  suitable  for  digital  computer  processing.  Two  examples 
of  such  pre-processing  are  quantization  (i.e.,  analog- to-digital 
conversion)  in  both  space  and  amplitude  and  down-sampling  (i.e., 
one  way  of  reducing  the  number  of  bits  used  in  representing  the 
source) .  The  second  process  is  often  that  of  edge  detection 
and  thresholding,  whereby  the  digitized  image  is  examined  for 
large  changes  in  intensity  followed  by  a  thresholding  operation 
that  retains  only  those  edges  that  are  likely  to  be  significant 
in  subsequent  processing.  Next  is  usually  an  image  segmentation 
process,  in  which  certain  characteristics  (e.g.,  texture)  are 
utilized  to  subdivide  the  image  into  distinct  nearly  homogeneous 
regions.  These  resulting  regions  could  then  be  encoded  (as  des¬ 
cribed  below)  to  remove  the  redundant  information  within  them, 
while  still  maintaining  their  essential  attributes.  Feature 
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extraction  follows;  typical  features  in  cartographic  images  in¬ 
clude  straight  lines,  regularity  or  periodicity,  aspect  ratios, 
and  relative  sizes.  Finally,  the  features  are  used  in  a  classi¬ 
fier  that  may  evaluate  a  linear  or  nonlinear  function  of  the 
features  and  compare  it  to  a  threshold  or  use  the  features 
sequentially,  stopping  when  a  decision  can  be  made.  Many 
classifier  designs  exist  and  tradeoffs  between  feature  complex¬ 
ity  and  classifier  complexity  are  inevitable. 

Each  of  the  major  processes  that  were  briefly  discussed 
above  (in  reference  to  Figure  2.1  will  be  discussed  in  greater 
detail  in  the  subsequent  sections  of  this  report.  However,  it 
is  first  important  to  remember  what  is  the  ultimate  objective 
of  the  system  shown  in  Figure  2.1:  to  extract  and  correctly 
identify  particular  objects  that  are  of  interest  to  the  user, 
given  an  input  picture.  With  this  in  mind,  it  is  necessary  then 
to  analyze  the  list  of  objects  (or  targets)  of  interest  to  the 
user  and  attempt  to  categorize  them  according  to  their  obvious 
features.  Such  an  analysis  is  performed  in  the  following 
section. 


Preliminary  Target  Classification 
In  this  study  several  targets  of  high  interest — bridges 
and  airports— have  been  identified  for  focused  investigation. 
Table  2.1  categorizes  a  wide  range  of  cartographic  targets  in 
terms  of  their  origin,  either  natural  or  man-made.  Since  the 
target  identification  process  is  facilitated  by  identifying 
cues  or  component  attributes,  and  since  it  is  desired  to  make 


Table  2 . 1 


High-interest  Target  Set 


Natural 

Rivers  and  Streams 

Onpaved  Roads  and  Trails 

Rapids  and  Falls 

Shoreline 

Large  Rivers 

Lakes 

Forests 

Scrub 

Marsh  and  Swamp 


Man-Made 

•  isolated  Buildings 

•  Storage  Tanks 

•  Quarry  or  Borrow  Pit 

•  Tunnel  Entrance 

•  Canals 

•  Dual  Highways 

•  primary  Roads 

•  Secondary  Roads 

•  Train-Tracks 

•  Transmission  Line 

•  pipe  Line 

•  Levee 

•  Dam 

•  Bridge 

•  Airport 

•  Orchard  and  Vineyard 

•  urban  Area 

•  Suburban  Area 

•  industrial  Area 

•  Railroad  Area 


•  Cemetery 


decisions  that  distinguish  between  different  classes  of  targets, 
it  is  desirable  to  use  unique  sets  of  attributes  to  aid  in  the 
decision  process,  each  target  class  being  associated  with  several 
attributes  unique  to  its  character.  If,  for  example,  the  attri¬ 
bute  "long,  thin,  parallel  strip"  is  a  co-attribute  of  ten  of 
the  high-interest  targets,  then  it  is  of  only  minimal  value 
since  it  does  not  uniquely  identify  a  particular  member  of  the 
set  of  targets.  Of  course,  there  are  other  objects  and  shapes 
that  may  appear  in  aerial  images  that  do  not  appear  in  the  list 
of  31  high-interest  targets.  These  may  be  improperly  identified 
as  one  of  the  high- interest  candidates  if  the  attribute  set  is 
sufficiently  broad.  The  identification  of  similar  characteris¬ 
tics  and  elementary  cues  can  serve  to  structure  the  problem  and 
to  help  eliminate  the  selection  of  those  characteristics  or 
attributes  that  lend  little  to  the  decision  process. 

In  this  brief  section  the  members  of  the  high-interest 
target  set  will  be  considered  in  terms  of  their  similarities 
and  differences  by  classifying  them  into  subsets.  Generally 
the  classification  of  objects  into  natural  and  man-made  is 
done  to  identify  those  targets  that  exhibit  smooth  lines  (man¬ 
made)  versus  those  that  exhibit  irregular  lines  (natural  objects) 
(Freeman  makes  note  of  this  general  characteristic  in  his  review 
article  on  computer  processing  of  line  drawings) .  A  river  will 
exhibit,  for  example,  a  shoreline  that  randomly  curves  and 
forks.  On  the  other  hand,  a  man-made  canal  will  generally  be 
built  with  well  known  geometrical  shapes  and  will  have  smooth 
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One  of  the  factors  that  precipitated  this  brief  look  at 
the  target  classes  was  the  inclusion  of  several  very  similar 
road  or  road-like  structures  in  the  table.  If  these  are  indeed 
to  be  distinguished  from  one  another,  then  specific  characteris¬ 
tics  or  combinations  of  them  that  set  them  apart  must  be  identi¬ 
fied. 

In  the  table,  for  example,  there  are  seven  candidates  that 
exhibit  characteristics  that  are  road-like: 

•  Unpaved  Roads  and  Trails 

•  Dual  Highways 

•  Primary  Roads 

•  Secondary  Roads 

•  Train  Tracks 

•  Transmission  Lines 

•  Pipe  Lines 

All  appear  from  the  air  as  smooth-sided  ribbons  of  uni¬ 
form  luminance  or  texture;  all  exhibit  parallel  sides.  Often 
all  of  the  seven  are  built  using  long  straight  segments.  All 
include  intersections  or  forks  that  allow  interconnections  of 
two  or  more  of  their  numbers.  If  an  edge-detection  algorithm 
is  used  to  create  a  line  drawing  depicting  these  seven  objects, 
what  is  to  distinguish  them?  Perhaps  the  specific  dimensions 
can  be  used  in  some  cases,  but  r.o  clear  distinction  is  obvious 
based  only  on  this  parameter.  In  some  cases  other  attributes 
may  be  available  to  assist  in  separating  the  targets  within  a 
class.  In  the  case  of  road-like  structures,  paved  roads  may 
be  differentiated  from  transmission  lines,  pipelines,  and  rail¬ 
road  rights  of  way  based  on  the  textured  properties,  both 
within  the  parallel  line  structure  and  adjacent  to  it. 


In  summary,  all  exhibit  these  characterises: 

•  High  Aspect  Ratio  (L/W) 

•  Limited  Width 

•  Uniform  Width 

•  Smooth  Curves  or  Straight-Sided.  May 
include  sharp  angles  or  intersects 

•  Intersected  by  other  road-like  structure 

•  Not  Isolated.  Usually  continues  off  image 

•  Constrasting  Surface  with  respect  to 
surroundings 

and  these  characteristics  do  not  appear  in  any  of  the  other 
targets  (with  the  exception  of  airports  and  urban  and  suburban 
areas  that  will  be  classified  here  as  "complex"  targets,  as 
will  be  defined  later) .  This  suggests  that,  at  least  in  the 
case  of  road  and  road-like  structures,  a  multi-tiered  decision 
tree  could  be  applied  to  structure  the  identification  problem. 
Figure  2.2  shows  a  portion  of  the  decision  tree.  Only  the 
branches  associated  with  the  road-like  geometry  have  been  fully 
traced  in  the  Figure.  At  the  upper  level  are  all  target  geo¬ 
metries.  Envisioned  at  the  next  subsequent  decision  level  are 
several  algorithms  that  test  the  candidate  image  for  membership 
in  general  classes  that  include  characteristics  like  ribbon 
pattern  (parallel,  equally  spaced  lines  of  extended  length) , 
isolated  building  structure  (relatively  small,  smooth-sided 
polygon  of  uniform  luminance  or  uniform  texture) ,  water  target 
(uniform  luminance,  constrasting  with  surroundings,  relatively 
large  surface  area) ,  and  so  on.  (The  example  characteristics 
proposed  here  are  only  for  illustrative  purposes.  As  the  tar¬ 
gets  are  each  studied  in  greater  detail  the  specific  attribute 
necessary  to  perform  the  decision  process  will  be  carefully 
developed.)  At  the  next  level  of  decision-making  in  the 
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decision- tree-branch  representing  the  ribbon  structure,  the  tex¬ 
ture  may  be  tested  to  determine  if  the  candidate  is  of  uniform 
luminance  (paved  roads) ,  or  if  it  is  dull,  or  of  low  reflectance 
or  textured  (pipelines,  transmission  lines,  unpaved  roads  and 
trails) .  At  the  next  level  the  structure  may  be  classified 
further  depending  upon  the  specific  structure  of  the  logic  tree. 
In  the  case  of  the  branch  depicting  roads,  the  key  parameter 
that  separates  dual  highways,  primary,  and  secondary  roads  is 
the  width  and  number  of  parallel  ribbons  that  make  up  the  image. 

The  decision  tree  described  in  Figure  2.2  invokes  the 
notion  of  a  sequential  logic  process.  That  is  to  say,  the  lower 
level  tests  and  decisions  in  the  tree  are  carried  out  only  as 
necessary  as  the  branches  and  branch  points  of  the  tree  are 
traced  out  to  a  tip  of  the  tree.  Such  a  process  could  be  imple¬ 
mented  in  a  modular  format  using  software  modules  to  carry  out 
each  of  the  decision-related  tests.  These  modules  would  be 
called  upon  to  perform  their  specific  testing  operations  on  an 
as-required  basis  by  an  executive  function  that  would  react  to 
the  current  position  in  the  logic  tree.  Such  a  concept  would 
be  more  efficient  than  one  that  would  be  designed  to  carry  out 
all  of  the  test  operations  in  parallel.  In  the  latter  case  the 
system  would  simultaneously  perform  all  of  the  operations  in¬ 
dicated  in  the  examples  in  the  logic  tree  irrespective  of  the 
results  of  other  tests  that  may  be  performed,  in  such  a 
decision  structure  the  results  of  each  of  the  tests  could  be 
placed  in  a  feature  vector  much  like  those  indicated  in  Table 
2.2.  In  the  Table  a  number  of  example  target  characteristics 


18 


are  listed  across  the  top.  They  are  not  necessarily  the  most 
practicable  set  for  cartographic  purposes.  Each  of  the  target 
candidates  is  listed  along  the  side  of  the  matrix.  When  a  tar¬ 
get  exhibits  a  particular  attribute  a  "one"  is  entered  in  that 
attribute's  position  in  the  vector,  and  when  the  target  does 
not  exhibit  the  attribute,  a  "zero"  is  entered  in  the  attribute's 
position.  The  Table  is  minimally  constructed  (i.e.,  a  minimum 
number  of  attributes  have  been  defined)  when  all  of  the  target 
candidates  exhibit  unique  decision  vectors.  When  the  decision 
table  is  completely  constructed  then  tests  of  candidate  targets 
can  be  made,  and  the  identification  process  reduces  to  one  of 
matching  the  candidate  target's  decision  vector  with  those 
entered  in  the  Table.  As  additional  attributes  are  identified 
and  made  available  that  result  in  unique  decision  vectors,  the 
probability  of  a  correct  decision  should  increase  since  more 
information  is  being  used  in  making  the  final  identification 
decision. 

One  exercise  that  was  carried  out  during  consideration  of 
the  entire  suggested  target  set  of  31  candidates  was  a  grouping 
along  the  lines  of  eight  major  categories.  These  major  catego¬ 
ries  could  lend  themselves  to  the  logic  tree  format  discussed 
here.  Within  each  major  grouping  a  second-level  decision  would 
then  be  applied  to  further  separate  the  target  candidates.  Al¬ 
though  the  specific  characteristics  necessary  to  carry  this 
process  to  completion  are  forthcoming,  the  process  is  a  natural 
extension  of  that  described  earlier. 
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The  seven  major  target  categories  and  their  members  are 

I.  Road  and  Road-Like  (Ribbon) 

•  Dual  Highways 

•  Primary  Roads 

•  Secondary  Roads 

•  Railroad  Tracks 

•  Transmission  Lines 

•  Pipelines 

•  Trails 

•  Unpaved  Roads 

II.  Simple  Large  Scale  "Area"  Targets  (Irregular) 

•  Cemetery 

•  Orchards  and  Vineyards 

•  Quarry  or  Borrow  Pit 

•  Marsh  and  Swamp 

•  Scrub 

•  Forests 

III.  Small  "Area"  Targets  (Regular) 

•  isolated  Buildings 

•  Storage  Tanks 

IV.  Complex  (Aggregate)  Targets — Large  Scale 

•  Airport 

•  Urban  Area 

•  Suburban  Area 

•  Industrial  Area 

•  Railroad  (Switchyard)  area 

V.  Water-Bearing 

•  Rivers 

•  Large  Rivers 

•  Streams 

•  Lakes 

•  Canals 

VI.  Appurtenances  of  Water-Bearing  Bodies 
(cued  by  bodies  of  water) 

•  Levees 

•  Dams 

•  Bridges  (road  or  railroad  may  also  cue) 

•  Shoreline 

•  Rapids  and  falls 

VII.  Miscellaneous  Category 


Tunnel  Entrances 


The  discussion  above  illustrates  the  interaction  that  must 


occur  between  an  intuitive  approach  (i.e.,  one  that  would  model 
the  decision  procedure  after  that  followed  by  a  human  photo 
interpreter)  and  a  formal  one  that  relies  on  the  structure  of 
the  data.  Formal  methods  nevertheless  require  a  data  set  that 
will  be  examined  for  structure;  the  features  that  are  initially 
measured  must  represent  the  designer's  estimate  of  discrimina¬ 
tory  power.  Statistical  methods  provide  guidance  in  pruning 
the  candidate  feature  set  when  a  large  data  set  exists,  but 
the  small  data  base  used  here  makes  the  decision-tree  method 
seem  reasonable.  The  next  chapter  presents  some  tools  used  for 
extracting  those  features. 
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CHAPTER  III 


ROUTINES  FOR  BRIDGE  PATTERN  DETECTION 

The  segment  of  the  project  discussed  in  the  following  has 
as  its  goal  the  generation  of  a  small  set  of  structures  from  an 
image  that  are  "potential  bridges."  These  bridge  candidates 
are  sets  of  parallel  lines  satisfying  certain  conditions;  name¬ 
ly,  pairs  of  lines  that  are  long  and  relatively  close  together. 
Each  possible  "bridge"  is  then  subject  to  a  set  of  tests  that 
result  in  the  structure  being  labelled  "bridge"  or  "not  a  bridge" 
The  process  involves  four  primary  steps: 

(1)  Pre-processing  (if  desired  or  necessary  to 
down-sample,  reduce  noise  or  both). 

(2)  Edge  detection 

(3)  Recognition  of  long,  parallel  lines  from  the  edge- 
detected  field. 

(4)  Testing  the  resulting  lines  for  "bridge"  or  "non¬ 
bridge"  conditions. 

Each  of  these  steps  is  discussed  in  turn. 

Pre-Processing  Techniques 

Two  pre-processing  routines  have  been  developed:  image 
down-sampling  and  image  noise  reduction.  The  potential  desir¬ 
ability  of  these  two  procedures  are  clear.  For  example,  the 
original  images  being  processed  have  been  digitized  to  256 
pixels/inch.  The  image  scale,  however,  may  be  such  that  the 
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structures  to  be  detected  are  on  the  order  of  millimeters  or 
more.  Thus,  a  large  computational  advantage  may  be  gained  with 
minimal  relevant  information  loss  by  down-sampling  the  image 
to,  say,  64  pixels/inch  (a  4-to-l  down-sampling).  The  benefit 
is  a  factor  of  16  reduction  in  the  number  of  pixels  to  be 
processed  while  (as  will  be  seen)  retaining  essentially  all 
relevant  detail. 

The  algorithm  of  Appendix  3.1  allows  variable  ratio  down 
sampling  in  two  manners:  first,  by  simply  extracting  every  n-th 
pixel  in  every  n-th  row  (giving  an  n-to-1  down-sampling) ;  or 
second  by  collapsing  every  n-by-n  square  of  pixels  into  a  single 
pixel  with  a  value  equal  to  the  average  of  the  n2  pixels.  The 
first  method  is  computationally  much  faster  and  appears  to  be 
equally  effective  in  the  images  being  used.  Almost  all  of  the 
examples  shown  in  the  following  are  the  result  of  processing 
sections  of  a  4-to-l  down-sampled  image  resulting  from  method 
one.  Figure  3.1  is  the  4-to-l  down-sampled  image  section  con¬ 
taining  a  narrow  bridge. 

The  algorithm  of  Appendix  3.2  is  that  used  for  image  noise 
reduction.  One  of  two  types  of  noise  reduction  may  be  performed: 

(a)  eliminating  isolated  noise  "spikes"  via  thresholding;  or 

(b)  image  smoothing  using  one  of  three  convolutional  masks. 

Noise  spike  elimination  is  performed  as  follows:  the  value 
of  each  pixel  is  compared  to  the  average  of  its  eight  nearest 
neighbors.  If  the  absolute  value  of  the  difference  between  the 
pixel  value  and  the  average  is  greater  than  a  threshold,  the 
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Figure  3.1.  Downsampled  Digitized  Image  Containing  a  Bridge 


24 


pixel  is  set  equal  to  the  average.  The  threshold  level  may  be 
dynamically  varied  over  the  image  by  setting  it  equal  to  a  spe¬ 
cified  number  of  standard  deviations  above  the  8-neighbor 
average.  Thus,  regions  of  "smoothness"  are  compared  to  a  low 
threshold  while  rapidly  varying  regions  are  compared  to  a  high 
threshold. 

Image  smoothing  is  achieved  by  specifying  one  of  three 
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As  can  be  seen,  the  masks  differ  in  the  weighting  given  to  pixels 
nearer  the  central  pixel.  Hie  convolutions  are  all  normalized  so 
that  the  average  image  brightness  is  unchanged.  Figure  3.2  is 
an  example  of  smoothing  using  Mask  1.  As  is  expected,  the  image 
of  the  bridge  (corresponding  to  the  top  half  of  Figure  3.1)  is 
broadened  and  its  edge  contrast  reduced. 

Edge  Detection 

A  review  of  the  literature  reveals  a  large  number  of  edge 
detection  algorithms  for  digital  images,  most  falling  into  one 
of  two  categories:  template-matching  operators  (21] ,  [31] ,  and 
differential  operators  [11]  ,  [14]  ,  [33]  .  The  latter  group  in¬ 
cludes  both  2x2  pixel  operators  ("Robert's"  operator)  and  the 
3x3  pixel  operators  ("Sobel"  and  "Prewitt"  operators).  Compari¬ 
sons  of  these  techniques  [1],  [10],  [13]  in  terms  of  performance 
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Figure  3.2.  Smoothed  Version  of  Upper  Half  of  Figure  3.1 
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and  computational  simplicity  suggest  that  the  group  of  3  x  3 
differential  operators  may  be  preferred  for  the  current  task. 
Thus,  it  is  to  this  group  that  attention  was  directed. 

In  general,  the  3x3  differential  operators  are  of  the 

form: 
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With  W  =  1,  the  operator  is  known  as  the  "Prewitt"  or  "smoothed 
gradient"  operator.  With  W  *  2,  it  is  the  "Sobel"  operator. 

F  and  G  are  simultaneously  convolved  with  the  digitized  image 
and  an  edge  is  judged  to  be  present  if: 

(F  •  B)2  +  (G  .  B)2  >  A 

where  A  is  a  pre-set  threshold  value,  and  B  is  the  3x3  sub¬ 
image  currently  being  tested.  An  alternative  threshold  expres¬ 
sion  compares  the  sum  of  the  absolute  values  rather  than  the 
sum  of  the  squares,  for  computational  efficiency. 

Two  potential  "improvements"  in  the  performance  of  the 
3x3  differential  operators  have  been  proposed  by  Frei  and  Chen 
[14].  The  first  modification  suggests  using  W  *  /2.  The  result 
is  an  "isotropic"  operator:  that  is,  one  in  which  an  edge  is 
equally  likely  to  be  detected  regardless  of  its  angular  orien¬ 
tation.  To  see  this,  consider  the  detection  of  an  arbitrary 
edge  such  as  in  Figure  3.3  located  at  90°  and  45°. 
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Figure  3.3.  Detection  of  an  Arbitrary  Edge 

Centering  the  3x3  operator  over  the  central  3x3  images  (in 
the  square)  and  letting  W  =  1  ("Prewitt")  we  get: 

F-A  *  1  •  (2+1+0)  +  0 • (2+1+0)  +  (-1). (2+1+0)  =  0 

G-A  =  1  -  ( 2+2+2 )  +  0 • ( 1+1+1)  +  (-1) -(0+0+0)  =  6 

Thus:  (F-A)2  + (G- A) 2  -  02+62  =  36 

For  B:  F-B  =  1- (2+2+1)  +  0- (2+1+0)  +  (-1). (1+0+0)  =  4 

G-B*  1 - ( 2+2+1)  +  0  - (2+1+0)  +  (-1). (1+0+0)  *  4 

Thus:  (F-B2  +  (G-B)2  =  42+42  =  32 

That  is,  the  identical  rotated  45°  results  in  a  lower  edge  "mag¬ 
nitude"  and,  all  else  equal,  a  poorer  chance  of  exceeding  any 
specific  threshold.  Similarly,  with  W  =  2  ("Sobel")  we  obtain: 

(F-A)2  +  (G-A)2  =  64  (90°) 

(F-B)2  +  (G-B)  2  =  72  (45°) 

and  again  the  45°  edge  results  in  a  higher  value  (72  vs  64). 

If,  instead,  we  let  W  *  /2  ("isotropic")  it  is  observed  that: 

(F-A)2  +  (G-A)2  *  36.62 

(F-B)2  +  (G-B)2  ■  36.62 
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i.e.,  an  equal  value  results  from  the  edge  operator  at  the 
different  angles. 

The  second  potential  improvement  deals  with  the  manner 
in  which  the  edge  thresholding  is  performed.  Suppose  the  3x3 
subimage  to  be  subjected  to  the  edge  operator  is  considered  to 
be  a  vector  in  a  2-dimensional  space;  the  basis  vectors  of  this 
space  are  the  "edge"  vector  (i.e.,  F  +  G)*  and  a  vector  repre¬ 
senting  "all  else."  This  is  depicted  in  Figure  3.4.  Tradition¬ 
ally,  the  thresholding  is  performed  by  determining  the  edge  space 
component  of  B  (i.e.,  the  dot  product  of  B  with  the  edge  vector, 
or  edge  operator)  and  comparing  it  to  a  threshold  value  A.  Now 
.consider  the  situations  illustrated  by  Figure  3.5. 


Figure  3 . 4.  Thresholding 
in  Edge  Space:  Case  1 


Figure  3.5.  Thresholding  in 
Edge  Space:  Case  2 


"Actually,  the  "edge  space" 
defined  by  Frei  &  Chen  had 
2  additional  components 
(H  and  £,  at  right)  to  F 
and  G.  “in  practice,  in- 
cludTng  H  and  £  added 
little  to  the  operator's 
performance. 
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Here,  the  projection  of  B  onto  the  edge  space  (its  edge  compo¬ 
nent)  exceeds  the  threshold  A,  although  B  actually  appears  to 
be  closer  to  the  non-edge  space.  Alternatively,  C  is  very 
close  to  the  edge  space  but,  due  to  its  small  magnitude,  would 
not  be  called  an  edge  point  since  its  dot  product  with  the  edge 
space  is  less  than  A.  Neither  case  would  have  occurred  if  an 
"angular  threshold"  had  been  used;  that  is,  if  thresholding 
had  been  performed  not  on  the  magnitudes  of  B  or  C,  but  on  the 
angle  0  between  the  vector  and  the  edge  subspace. 

If:  X  -  (B.F)2  +  ( B. G) 2 

is  the  magnitude  of  the  edge  space  component  of  the  image  vector 
B,  then  e  is  calculated  as  follows; 

cos0  »  /X/ Al-B) .  (B  B)  -*•  0  *  arccos(X/(B-B)  J1/2 

If  0'  is  the  selected  threshold,  then  we  will  call  B  an  edge 
point  if: 

(B*F)2  +  (B*G)2 

arccos  - ^ —  1/2  =  T  <.  0 ' 

{ B-B)  2 

or  equivalently,  since  the  cosine  is  monotonically  decreasing 
between  0  and  if 

T  >  cos2  0 '  =  A’ 

Comparing  to  the  previous  threshold  expression,  the  net  effect 
is  that  the  original  test  statistic,  (B«F)2  +  (B*G)2,  is  divided 
by  the  magnitude  of  the  edge  "signal”  under  consideration,  and 
compared  to  a  new  threshold  A'.  From  an  intuitive  viewpoint. 
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this  modified  threshold  tests  for  the  "closeness  of  fit"  to 
being  an  edge,  rather  than  "strength"  of  the  possible  edge 
signal. 

Using  either  threshold,  the  orientation, y  ,  of  a  detected 
edge  point  is  given  by: 

Y  =  arctan  [ (G* B )/ ( B- B) ] 

A  computer  subroutine  has  been  developed  to  perform  the 
edge  detection  operation  by  convolving  the  3x3  operator  with 
an  input  image  (Appendix  3.3).  The  subroutine  calls  the  edge 
operator  (Appendix  3.3)  for  each  pixel  I  and  returns  for  each 
pixel  judged  to  be  an  edge  its  magnitude  and  orientation.  The 
value  of  W  ( 1  *  "Prewitt",  2  =  "Sobel",  SQRT(2)  =  "isotropic," 
etc.)  is  specified,  as  well  as  the  type  of  threshold  (magnitude 
or  angular)  and  threshold  level. 

Figures  3.6  and  3.7  provide  examples  of  sub images  sub¬ 
jected  to  edge  detection  using  both  types  of  thresholding. 
Thresholds  were  adjusted  until  both  methods  generated  equal 
numbers  of  edge  points.  As  can  be  seen,  the  angular  threshold 
detected  many  low  contrast  edges  in  the  water  region  which, 
although  really  existing,  are  irrelevant.  Thus,  since  the  cur¬ 
rent  task  involves  detection  of  structures  which  may  be  of 
relatively  high  contrast,  the  higher  sensitivity  of  the  angular 
threshold  to  low-contrast  edges  may  be  wasted  (or  worse,  inter¬ 
fering)  .  The  method  of  choice  for  this  project,  however, 
remains  to  be  determined  from  experience. 
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Figure  3.6.  Edge  Detection  Using  Angular  Thresholding 
(Image  segment  corresponds  to  Fig.  3.1; 
threshold  *  0.55  radians.) 
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Figure  3.7.  Edge  Detection  Using  Magnitude  Thresholding 
(Image  segment  corresponds  to  Fig.  3.1; 
threshold  -  500;  128  detected  edge  points.) 
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Line  Recognition 

The  output  of  an  edge  detection  algorithm  such  as  des¬ 
cribed  above  is,  in  general,  a  set  of  discrete,  disconnected 
edge  segments,  often  appearing  to  be  randomly  placed.  An 
example  is  shown  in  Figure  3.8.  Hie  next  test  is  usually  to 
use  a  "linking"  or  "line  building"  algorithm.  Hiese  algorithms 
generally  look  for  line  segments  that  fall  within  a  given  tole¬ 
rance  of  distance  and  orientation  of  each  other.  When  such 
segments  are  found,  they  are  linked,  forming  a  single  longer 
line  segment.  This  process  continues  until  all  long  lines  (if 
present)  are  built  up.  All  segments  which  are  below  some 
threshold  value  in  length  are  dropped. 

The  above  process,  however,  can  be  time  consuming  and,  for 
the  task  of  (for  example)  detecting  bridges  may  not  even  be 
necessary.  In  the  present  task,  we  are  not  so  much  interested 
in  line-building  as  in  the  specific  question  "are  there  long 
lines  present  in  the  image?"  and,  if  so,  "are  there  parallel 
lines  whose  lengths  are  much  greater  than  their  separation?". 
These  questions  arise  since  bridges  are,  of  course,  composed  of 
long,  close  parallel  lines.  These  types  of  questions  may  be 
answered  using  a  Hough  transform. 

Basically,  a  Hough  transform  operates  on  a  set  of  prede¬ 
termined  feature  points  in  the  image  (or  X-Y)  space.  Hie  set  of 
edge  points  resulting  from  an  edge  detection  operation  is  such 
a  set  of  feature  points.  Hie  Hough  transform  uses  these  points 
to  generate  a  set  of  points  in  rho- theta  space,  where  the  rho 
and  theta  values  are  coordinates  of  a  line  in  X-Y  space.  Hiat 
is,  it  is  a  line-to-point  transformation. 
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Figure  3.8. 


Image  With  Apparently  Random  Structure 
(Lower  half  is  edge-detected  image. 
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To  explain  this  transformation,  assume  that  some  line  L 
exists  in  the  image  (X-Y)  space).  This  line  can  be  described 
by  two  coordinates:  rho(R),  the  perpendicular  distance  of  the 
line  to  the  origin  (which  may  be  selected,  for  example,  to  be 
the  lower  left  corner  of  the  image),  and  theta  (0),  the  angle 
of  the  perpendicular  with  the  X-axis  (Figure  3.9).  Now,  let 
B  be  an  edge  point  of  L  detected  by  an  edge  operation.  B  is 
thus  a  "feature  point".  An  infinite  number  of  lines  may  pass 
through  this  point.  Suppose,  however,  that  we  quantize  the 
angles  of  the  candidate  lines  into,  say,  eight  values  between 
0  and  180°  (22.5°  increments).  Thus,  we  allow  one  of  eight 


Figure  3.9.  Hough  Transform  Definition 
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possible  lines  to  pass  through  B  (Figure  3.10a).  We  now  plot 
the  coordinates  of  each  of  these  eight  lines  in  (R -0)  space 
(Figure  3.10b).  In  general,  the  result  is  a  curve  as  in  Figure 
3.10b. 


rho 


rho 


Theta 


Figure  3.10.  Hough  Transform  Space 


Now,  suppose  we  have  quantized  rho  into,  say,  n  levels  so  that 
(R-0)  space  is  represented  by  a  2-dimensional  (R-0)  matrix 
Figure  3.11).  We  now  perform  the  plotting  procedure  for  each 
of  the  8  possible  lines  by  incrementing  (by  1)  the  appropriate 
cell  in  the  (R-0)  matrix.  An  identical  process  is  performed 
for  every  feature  point  in  X-Y  space  (i.e.,  every  detected 
edge  point).  The  final  result  is  a  matrix  (in  which  each 
element  defines  a  particular  (R-0)  pair)  whose  entries  are 


37 


equal  to  the  number  of  times  each  cell  was  incremented.  That 
is,  a  cell  with  a  value  of  20  implies  that  20  edge  points  were 
detected  each  of  which  has  one  of  its  eight  possible  lines  posses¬ 
sing  those  (R-0)  coordinates.  Since,  of  course,  all  lines  with 


rho 


Figure  3.11.  The  R-0  Matrix 

the  same  coordinates  are  collinear,  a  (R-0)  matrix  element  with 
a  high  value  is  the  same  as  saying  that  a  large  number  of  fea¬ 
ture  (i.e.,  edge)  points  lay  along  the  same  line.  Thus  (R-0) 
elements  with  values  above  some  threshold  are  judged  to  be 
lines  (or  edges)  that  exist  in  X-Y  space. 

Cells  with  the  same  value  of  0  but  differing  R's  repre¬ 
sent  parallel  lines  in  the  image.  Cells  exceeding  the  thresh- 
hold  that  have  the  same  0  with  R's  that  are  close  together  define 
image  lines  that  are  long,  parallel  and  close  together.  These 
are  our  "potential  bridges"  (Figure  3.12). 

Although  more  to  the  point,  the  above  process  is  still 
calculation-intensive  with  fine  quantization,  even  if  the 
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Figure  3.12.  Example  of  a  Hough  Transform 
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number  of  feature  points  is  relatively  small.  However,  if  the 
feature  points  are  generated  by  the  edge  detection  mechanism 
previously  described,  then  we  are  not  using  all  of  the  informa¬ 
tion  at  hand.  That  is,  we  already  have  an  estimate  of  the 
orientation  of  the  line  passing  through  each  edge  point:  namely, 

Y  =  arctan[ (F. B)/(G-B) ] 

where  F,G,  and  B  are  as  defined  before.  Thus,  rather  than  cal¬ 
culating  rho  and  theta  for  a  number  of  possible  lines  (the 
number  depending  on  the  quantization  of  theta)  and  incrementing 
each  of  the  appropriate  (R-0)  cells,  we  do  it  only  for  that  line 
most  likely  to  pass  through  each  feature  point  (i.e.,  that  with 
coordinates  R,Q)  [15].  Since  we  already  know  0,  we  calculate 
rho: 

R  *  X-cos0  +  Y-sin0 

where  X  and  Y  are  the  image  space  coordinates  of  the  feature 
point.  The  net  result  is  a  single  addition  calculation  (of  R) 
for  each  feature  point  to  obtain  the  Hough  transform,  and  a 
far  more  efficient  way  of  generating  the  required  "line  exis¬ 
tence"  information. 

A  subroutine  to  perform  the  Hough  transformation  in  this 
manner  is  given  in  Appendix  3.3.  The  program  calls  the  edge 
operator  for  each  pixel  after  which  R  and  0  are  calculated  and 
the  appropriate  cell  incremented.  Hius,  this  routine  performs 
image  edge  detection  and  Hough  transformation  simultaneously. 

Two  weaknesses  of  the  Hough  transform  become  evident  when 
it  is  applied  to  a  large  input  image:  first,  that  a  large 
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amount  of  memory  is  required  to  store  the  (R-Q)  matrix  if  fine 
quantization  of  rho  is  desired;  and  second,  that  no  information 
is  provided  about  where  in  the  image  the  lines  exist  (i.e.,  no 
end  points  are  provided).  These  difficulties  may  be  avoided 
by  using  the  following  scheme:  divide  the  image  into  a  number 
of  small  (for  example  32  x  32  pixel)  sub- images.  Then  operate 
on  these  subimages  individually.  The  advantages  are  as  follows: 

(1)  since  the  distances  in  X-Y  space  covered  by  the  sub¬ 
images  will  be  small,  coarse  quantization  of  rho  will 
be  sufficient  to  maintain  fine  spatial  detail.  (Quanti¬ 
zation  of  0  is  already  limited  by  the  accuracy  of  the 
edge  detection  angle  calculation.) 

(2)  again,  since  the  spatial  distances  involved  are 
relatively  small,  the  question  of  "where  are  the  line 
endpoints"  is  less  important  since  it  is  limited  to 
the  range  covered  by  subimage. 

As  will  be  described  next,  the  final  result  of  this  scheme  is 
a  relatively  small  set  of  "bridge  candidates"  (long,  close 
parallel  lines) .  These  may  then  be  conveniently  subjected  to 
a  series  of  tests  or  further  analyses  to  determine  its  "bridge" 
or  "non-bridge"  status. 

Figures  3.13  and  3.14  give  Hough  transforms  and  inverted 
Hough  transforms  following  cell  thresholding,  for  subimages 
using  both  types  of  edge  detection  described  previously  to  gene¬ 
rate  feature  points.  In  both  cases,  the  bridge  candidates 
clearly  stand  out  and  are  easily  isolated,  although  as  before, 
there  are  fewer  irrelevant  entries  resulting  from  the  edge 
magnitude  thresholding  process. 

Additional  Analysis 

Although  work  on  this  step  has  just  been  started,  the  pro¬ 
cedure  is  relatively  straightforward  once  the  bridge  candidates 
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Figure  3. 13.  Example  of 
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Theta 


Figure  3.14.  Example  of  the 
Hough  and  Inverse  Hough 
Transforms  with  a  Narrow 
Bridge.  Hough  (right)  and 
inverse  (below)  based  on 
upper  part  of  Fig.  3.1. 

The  boundaries  and  shadow 
border  of  the  narrow 
bridge  have  been  extrac¬ 
ted. 
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have  been  isolated.  In  particular,  one  could  do  the  following: 

(1)  decompose  the  image  into  textured  segments  (corres¬ 
ponding  to,  for  example,  water,  land-urban,  and 
land-rural  via  one  of  several  proposed  algorithms 
[17],  [18],  [37].  Then,  long  parallel  lines  over, 
say,  water,  are  very  probably  bridges. 

(2)  perform  a  series  of  simple  "environmental"  tests  on 
each  candidate  to  determine  its  context.  For  example, 
one  could  determine  a  "global  contrast",  as  a  measure 
of  the  mean  and  variance  of  a  block  of  pixels  on  either 
side  of  the  lines.  Those  over  water  would  tend  to  a 
uniformity  and  equality  (i.e.,  similar  means  and  small 
variances)  for  blocks  or  pixels  on  either  side,  since 
these  would  correspond  to  water.  Another  simpler  pro¬ 
cedure  is  to  calculate  the  "edginess"  of  the  region 
around  a  bridge  candidate.  Edginess  is  taken  to  be 
the  number  of  detected  edge  points  in  the  region 
divided  by  the  total  number  of  points  in  the  region. 

(The  edge  detection  algorithm  of  Appendices  III-3  and 
III-5  determine  this  number  automatically  for  the  region 
contained  within  the  subimage  processed) .  Edginess 

is  a  common  textural  statistic  and,  again,  we  would 
expect  the  water  regions  (i.e.,  subimages  containing 
bridges  over  water)  would  have  a  much  lower  amount 
of  edginess  than  other  regions,  due  to  its  high 
uniformity. 


As  an  example,  we  shall  use  the  edginess  value  defined  above 
to  attempt  a  classification  of  four  potential  bridges.  As 
before,  these  candidates  are  taken  to  be  sets  of  long,  close 
parallel  lines  extracted  from  image  segments.  The  four  are 
those  from  Figures  3.13  and  3.14  (one  each),  plus  those  illus¬ 
trated  by  the  image  and  edge-detected  images  of  Figures  3.16 
and  3.17.  The  Hough  transform  matrices  of  these  images  are 
shown  in  Figure  3.15  with  the  bridge  candidates  circled.  From 
experience,  we  may  set  the  edginess  threshold  at,  say,  0.20. 
Those  candidates  located  in  images  with  edginess  values  less 
than  0.20  will  be  judged  as  being  over  water,  and  therefore, 
bridges.  Results  are  shown  in  Table  3.1. 


(a) 


Figure  3.15  Hough 
Transform  Matrices 


(a)  For  Fig.  3.16. 

(b)  For  Fig.  3.17. 


The  potential 
bridges  (long, 
close,  parallel 
lines)  are  circled. 


(b) 
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Figure  3.16.  Raw  Image  and  Edge-Detected  Version:  Highway 
and  Overpass 
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Figure  3.17.  Raw  Image  and  Edge-Detected  Version:  Bridge 
Over  River 
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TABLE  3.1 

Edginess  of  Bridge  Candidates 


#  edge  points 

Candidate  (excluding)  Thres-  Classifi- 

from:  bridge  pts)  Edginess  hold  cation  Actual 


Figure 

3.13 

406 

0.41 

0.20 

Non-bridge 

Highway 

Figure 

3.14 

84 

0.09 

0.20 

Bridge 

Bridge 

Figure 

3.16 

313 

0.31 

0.20 

Non-bridge 

Highway/ 

Overpass 

Figure 

3.17 

131 

0.15 

0.20 

Bridge 

Bridge 

Summary 

Bridge  candidates  are  extracted  from  raw  digitized  images 
by  performing  a  Hough  transform  on  an  edge  detected  image  and  keep¬ 
ing  only  those  entries  corresponding  to  long  parallel  lines  that 
are  close  together.  These  are  represented  by  matrix  entries  in 
the  same  column  (same  angle)  and  in  rows  that  are  adjacent  or 
nearly  adjacent.  Further  testing  is  then  done  on  this  small  set 
of  potential  bridges  to  determine  whether  or  not  a  bridge  is  pre¬ 
sent.  An  example  using  image  edginess  in  the  neighborhood  of 


the  candidate  is  given 


CHAPTER  IV 


THE  DISCRETE  MEDIAL  AXIS  TRANSFORM 
Introduction 

The  Medial  Axis  Transform  (MAT)  or  "prairie  fire"  skeleton 
of  an  image  was  first  defined  by  Blum  [1]  in  1964.  There  was 
a  flurry  of  activity,  both  theoretical  and  practical,  for  the 
next  6  years,  ending  about  1970,  (see  refs.  [3]-[8],  [19], 

[20],  [22] - [26] ,  [29],  [32],  [34]).  Since  1970,  the  MAT  has 
been  mentioned  in  some  texts  ([9],  [11],  [30],  [36]),  and  some 
investigators  have  used  it  with  success  in  certain  problems 
([28],  [38]).  Part  of  the  problem  with  the  MAT  which  caused  a 
decline  of  interest  in  1970  was  that  the  existing  algorithms 
were  either  difficult  to  understand,  or  they  ran  too  long  on 
the  computers  then  available.  Moreover,  the  results  were  highly 
dependent  upon  the  orientation  of  the  object  being  transformed; 
that  is,  the  algorithms  were  very  sensitive  to  rotation  in  the 
plane.  More  recently.  Wall  and  his  associates  [38]  have  employed 
an  algorithm  essentially  identical  to  the  one  described  later 
in  this  paper,  and  Pavlidis  [28]  has  used  the  MAT  as  the  basis 
for  defining  a  variety  of  shape  descriptors  for  use  in  pattern 
classification  schemes. 

The  present  study  was  provoked  by  a  need  to  find  feature 
extraction  techniques  which  could  be  used  to  identify  bridges  in 
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aerial  photographs.  As  noted  earlier,  one  of  the  features  of  a 
bridge  in  a  photograph  that  is  potentially  useful  as  a  discri¬ 
minator  in  a  classification  scheme  is  the  aspect  ratio.  Another 
useful  feature  is  the  length  of  the  bridge.  Virtually  all 
bridges  are  long,  and  have  aspect  ratios  which  are  high.  Thus, 
a  test  of  these  two  features  can  eliminate  many  objects  from 
further  consideration  in  an  automatic  bridge  identification 
scheme. 

Both  the  aspect  ratio  and  the  length  of  an  object  are 
easily  extracted  from  the  object's  MAT  by  simple  syntax  tests. 

The  definitions  of  a  grammar  and  the  test  itself  are  not  dis¬ 
cussed  here.  Instead,  the  computation  of  the  MAT  in  a  binary 
image  by  thinning  is  considered,  and  extensions  to  grayscale 
images  are  examined. 

A  Binary  Thinning  Algorithm 

The  "prairie  fire"  definition  of  the  MAT  can  be  used  to 
generate  an  algorithm  for  the  computation  of  the  MAT.  For  ex¬ 
ample  the  white  areas  of  a  binary  image  can  be  skeletonized  or 
thinned  quite  easily  by  application  of  the  thinning  algorithm 
described  below.  Black  areas  can  be  thinned  by  complementing 
the  image  both  before  and  after  application  of  the  thinning 
algorithm. 

This  thinning  algorithm  operates  by  using  a  3  x  3  pixel 
mask  defined  as 

ABC 
D  E  F 
G  H  I 

where  the  candidate  for  replacement  is  the  element  lying  under  E. 


The  original  image  is  copied,  and  then  the  mask  is  moved  over 
the  original  image  of  size  m  x  n  such  that  the  mask  does  not 
extend  past  the  edges  of  the  original  mask.  Thus,  the  thinning 
algorithm  does  not  have  any  effect  on  the  border  pixels  in  the 
original  image. 

Initialization  is  effected  by  setting  a  flag  and  defining 
a  quantity  called  "pass  number"  and  setting  it  equal  to  1. 

Then  the  mask  is  slid  across  the  image  beginning  with  element 
(2,2)  under  E  and  ending  with  element  (m-l,n-l)  lying  under  E. 

At  each  point,  a  test  is  made  to  see  if  the  element  lying  under 
E  is  a  1.  If  not,  the  mask  is  slid  to  the  next  position.  If 
the  element  under  E  was  a  1,  then  a  test  is  made  of  the  set  of 
elements  lying  under  mask  elements  B,  D,  F,  and  H.  If  two  or 
three  of  these  elements  are  l's,  then  the  path  between  each 
pair  of  l's  is  examined  to  see  if  they  are  connected  by  l's 
where  element  E  is  not  an  allowed  path  member.  For  each  pair 
there  are  2  paths  around  the  perimeter  of  the  mask.  In  order 
to  pass  the  test  of  connectivity  between  l's,  only  one  of  these 
paths  need  be  all  l's.  If  the  test  is  passed  by  each  pair  of 
l's,  then  the  element  lying  under  E  in  the  original  image  is 
replaced  with  a  0  in  the  copied  image  and  the  flag  is  reset. 

If  there  were  not  two  or  three  l's  in  the  set  B,  D,  F,  H,  or 
if  the  connectivity  test  is  failed  by  any  pair  of  l's,  then 
the  mask  is  slid  to  the  next  position  with  no  action  taken  on 
element  E.  At  the  conclusion  of  each  pass  over  the  image,  the 
flag  is  tested.  If  it  is  reset,  the  pass  number  is  incremented, 
the  original  image  is  replaced  with  the  copied  image,  the  flag 
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is  again  set,  and  the  algorithm  proceeds  anew.  If  the  flag  has 
not  been  reset,  then  no  elements  were  replaced  in  the  preced¬ 
ing  pass,  and  the  algorithm  halts  by  outputting  the  copied 
image. 

The  operation  of  the  algorithm  is  shown  by  Figures 
4.1,  a,  b,  and  c. 


Computation  of  MAT  by  Thinning 

The  conversion  of  the  binary  thinning  algorithm  just  des¬ 
cribed  to  the  computation  of  the  medial  axis  transform  is  quite 
easy.  The  only  additional  work  required  is  bookkeeping:  i.e., 
a  list  must  be  maintained  whose  elements  are  certain  of  the 
replaced  elements  in  the  image  being  thinned.  The  criterion 
for  inclusion  in  this  list  is  that  the  element  being  replaced 
have  two  members  of  its  set  of  neighbors,  B,  D,  F,  and  H  equal 
to  zero.  This  is  the  defining  relation  for  a  skeleton  element. 

The  actual  list  consists  of  more  than  just  elements.  It 
has  four  fields  for  each  list  member.  The  first  two  fields  are 
the  coordinates  of  the  replaced  element.  The  next  field  is  the 
pass  number  in  which  the  replacement  was  effected.  The  fourth 
field  is  of  variable  length  and  consists  of  a  coded  representa¬ 
tion  of  where  the  neighbors  of  the  replaced  elements  were  1' s 
in  the  set  B,  D,  F,  and  H.  The  reason  that  this  field  is  of 
variable  length  is  that  the  MAT  of  an  object  computed  by  this 
algorithm  consists  of  all  the  replaced  elements  in  union  with 
the  elements  remaining  as  l's  in  the  thinned  image.  These 
elements  are  assigned  pass  number  0  in  the  list,  and  they  have 
no  neighbors  when  replaced  since  they  were  never  replaced. 
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Figure  4.1.  Test  Image;  a)  Original  Image;  b)  After  First 
Pass  of  Thinning  Algorithm;  c)  Thinned  Image 
After  Third  Pass;  d)  Skeleton  of  Image  from 
MAT  Algorithm. 
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The  coding  for  the  sequence  is  based  upon  the  well  known 
chain  code  mask  [30]. 

7  0  1 
6  2 
5  4  3 

As  an  example  of  the  application  of  this  algorithm,  con¬ 
sider  the  image  shown  in  Figure  4.1a.  The  skeleton  of  this 
image  is  shown  in  Figure  4. Id  and  its  MAT  in  list  form  is  shown 
below: 

2  2  1  24 

2  5  1  46 

4  6  1  46 

5  2  1  02 

5  3  1  06 

7  5  1  02 

7  6  1  06 

3  3  2  24 

3  4  0  — 

4  3  0  — 

4  5  0  — 

6  6  0  — 

6  7  0  — 

The  basis  of  the  algorithm  is  the  prairie  fire  concept. 

A  fire  lit  simultaneously  at  all  points  on  the  perimeter  of  an 
area  will  burn  towards  the  interior  from  the  perimeter  as  a 
wave  would  propagate  except  that  there  is  no  superposition. 

The  points  where  the  propagating  fire  fronts  meet  are  called 
quench  points  and  the  distance  to  the  perimeter  from  each 
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quench  point  is  called  the  quench  distance.  The  set  of  all 
quench  points  is  the  skeleton  of  the  object  and  the  set  of  all 
quench  points  and  quench  distances  is  the  medial  axis  transform 
of  the  object. 

In  the  discrete  plane,  the  prairie  fire  concept  can  be 
implemented  by  thinning  the  points  on  the  edge  of  an  object  and 
keeping  track  of  those  points  which  have  two  edges,  i.e.,  cor¬ 
ner  points,  as  the  object  is  repeatedly  thinned.  Keeping  track 
of  the  orientation  of  the  corners  allows  reconstruction. 

In  the  continuous  plane,  the  MAT  is  s  set  of  connected 
line  segments  made  up  of  straight  lines  and  arcs  of  parabolas 
[25].  In  the  discrete  plane,  the  line  segments  are  not  neces¬ 
sarily  connected  as  the  example  in  Figure  4.1  shows,  (see  also 
[9],  page  331).  Moreover,  there  is  some  visible  distortion  of 
the  MAT  as  shown  in  Figure  4.2b  which  is  a  function  of  the  scan 
directions  as  the  mask  is  passed  over  the  image.  This  factor 
must  be  considered  in  any  grammar  which  is  defined  to  permit 
syntax  testing  of  the  skeleton  or  of  the  MAT  itself. 

The  MAT  in  the  discrete  plane  is  also  somewhat  sensitive 
to  rotation,  as  is  discussed  in  [9].  This  sensitivity  is  not 
significant  if  the  definition  of  the  grammar  takes  it  into 
account.  Hence,  it  will  not  be  further  considered  in  this 
paper  inasmuch  as  it  will  be  dealt  with  in  a  future  paper  on 
grammars  for  syntax  testing  of  MAT'S. 

Montanari's  Method 

Montanari  [24]  formulates  the  determination  of  the  dis¬ 
crete  MAT  as  an  optimal  policy  problem,  that  is,  solving  the 
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problem  of  finding  the  shortest  path  through  a  network.  For 
the  discussion  that  follows,  one  must  realize  that  Montanari 
computes  external  MAT'S,  that  is,  MAT'S  which  lie  outside  the 
boundaries  of  the  objects.  To  arrive  at  MAT'S  which  have  a 
finite  set  of  points  on  the  skeleton,  i.e.,  bounded  MAT'S, 
Montanari  employs  "hollow”  objects  for  his  examples,  that  is, 
he  computes  the  MAT  for  either  unbounded,  simply  connected 
objects,  or  for  multiply  connected  objects  and  the  MAT  lies  in 
the  bounded  exterior  of  the  object. 

He  begins  by  defining  a  reticular  network  of  order  n  as 
one  where  the  slopes  of  the  lines  connecting  the  center  of  the 
network  to  the  vertices  in  the  first  octant  form  a  Farey  se¬ 
quence  of  order  n  (defined  as  the  ordered  sequence  of  all 
rational  numbers  between  0  and  1  with  denominators  less  than 
or  equal  to  n)  as  shown  in  Figure  4.3. 

He  then  defines  three  sets.  Set  U  is  the  set  of  all  the 
vertices  which  are  in  one-to-one  correspondence  with  the  coor¬ 
dinates  of  the  pixels  in  the  image,  i.e., 

U  =  [P'  1  1  <.  Xp  <  r;  1  <.  yp  <.  s;  (xp,yp)  =  pixel  coordinates]  . 
set  is  the  set  of  vertices  defining  the  object: 

I  *  [P ' I  a(P )  *  .True. ] 

i.e.,  the  set  corresponding  to  the  pixels  belonging  to  the  ob¬ 
ject.  Set  E  is  the  complement  of  the  set  1^,  i.e.,  E  ■  0  -  I. 

Also  defined  is  an  extra  vertex,  PN' ,  which  is  assumed  to  be  con¬ 
nected  to  every  vertex  I  by  an  arc  of  length  zero. 
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I 

I 

i 

i 

Now  for  every  e  u,  there  is  a  minimum  arc  length  con¬ 
necting  p^'  to  P^'  which  lies  on  a  set  of  network  rays.  This 
minimum  distance  Montanari  calls  the  quasi-Euclidean  distance 
since  from  any  n,  n  finite,  Dq(Pj.,P2)  <  db^p1^p2^  and  *n  the 
limit  as  n  -*■  «,  ■+•  DE  where  Dq  is  the  minimum  distance  along 

the  network  paths  and  DE  is  the  usual  Euclidean  distance. 

De  =  [(xx  -  x2)2  +  (yx  -  y2) 2 ] 1/2  (i: 

A  point  P^  can  be  defined  to  be  a  skeleton  point  if  and 
only  if  Pi'  e  E  and  it  does  not  lie  on  a  minimal  path  from  any 

other  vertex  Pj  '  to  PN'.  Thus,  for  every  vertex  P^',  the  mini¬ 

mum  path  to  PN'  is  found  and  all  the  Pj ' ,  j  /  i,  lying  on  this 
path  are  removed  from  further  consideration  since  they  cannot 
be  skeleton  points.  The  length  of  the  path  is  also  found  and 
is  associated  with  P^'  as  a  function  T(P^').  The  set,  £>,  of 
all  P^'  which  remains  after  the  entire  set  U  has  been  examined 
is  the  skeleton  and  the  set  of  all  the  T(P^')  defines  the 
quench  function. 

Montanari  determines  the  difference  between  the  quasi- 
Euclidean  distance  and  the  true  Eucludean  distance  by  comput¬ 
ing  the  error,  e,  where 

T(PX'  ,  P2’ )  -  Dg(pi '  ,  P2' ) 

_  e  =  - 

T(P^ '  ,  P2' ) 

where  P^'  and  P2 '  are  any  two  points  in  the  plane. 


58 


Montanari's  Algorithms 

Montanari  has  devised  two  algorithms  to  compute  the  dis¬ 
crete  MAT.  His  first  approach  is  to  solve  the  set  of  equations 

=  min(t^j  +  T  j )  ^  a  1»2 1 » N— 1 ;  j  =  1,2,«..,N,  j  ^  i  (3a) 

TN  =  0  (3b) 

where  if  the  arc  connecting  vertices  '  and  P j '  exists,  t^j  is 
the  length  of  this  arc;  otherwise  t^j  ■  00 ,  and  where  i  is  an 
index  indicating  the  forward  raster  sequence  number.  The 
initial  condition  is  that  T^  =  tiN.  Then  the  iterative  formula 
is 

T*  =  min  (ttj  +  Tj,  tir  +  T*-1)  i  =  1,2,...,N-1  (4) 

V  e  sKi 

Pc'  E  — i2 

k  k 

where  E^  is  the  set  of  vertices  P j '  for  which  the  value  Tj  was 

t  h  k  k 

computed  in  the  k  n  iteration,  and  E^2  *  U  ”  E^i i  [P i  1  -  The 

iteration  stops  when  -  t?  =  T^  for  all  P^. 

Now  the  order  m  which  the  t£  are  computed  is  arbitrary. 
However,  the  rate  of  convergence  is  a  function  of  the  order  of 
computation.  Montanari  has  found  an  order  of  computation  for 
which  convergence  is  achieved  in  only  two  iterations.  The 
secret  to  this  rapid  convergence  is  the  fact  that  the  image  is 
a  rectangular  array  and  that  information  gained  during  the  first 
iteration  is  used  to  reduce  the  computation  necessary  during 
the  second  iteration. 
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The  way  this  works  is  that  by  scanning  the  array  in  a 

forward  raster  sequence  in  the  first  iteration  and  with  a  back- 

2 

ward  raster  sequence  in  the  second  iteration,  Tj  =  Tj  for  all 
Pj'  in  two  iterations.  Moreover,  he  only  considers  P^'  e  E  since 
for  P^  e  _lf  =  0,  and  in  the  first  iteration,  he  can  ignore 

A 

the  vertices  Pr'  e  because  Tr  =  00 .  Other  shortcuts  are  based 

upon  the  fact  that  the  set  E^j_  is  empty,  hence  =  00  .  and  that 

2  1  12 
E^2  =  =  E^i  and  that  E^  “  E^i  =  E^  so  that  direct  use  of 

T?"  can  be  made  in  the  second  iteration. 


The  algorithm  is  as  follows: 
a)  Let  tJ  =  0  if  Pt  £  I 


Tj_  =  “  if  P]_  e  E 


«  min  ( t± j  +  Tj)  if  Pt  e  e 


(5a) 

(5b) 

(5c) 


pj’£  £il 


where 


£il  “  *  3  <il 


b)  Let 

f™ 

min  (t^j 

2  1 
+  TT,  TT) 

D  l 

i  =  n- 

Tl2  *  < 

1 Pj 

'  £  Ei2 

if 

p.  •  e 

l 

( 

J2) 

0 

if 

V  e 

where  E^2  *  [Pj  '  1  j  >  i 

] ,  and  U  = 

E  V  I  *  Eix 

v  *i2 

c)  Let  S  ■  [P^  |  Pk'  e  E  for  every  p^'  directly  con- 


2  2 

nected  to  P^' ,  Tk  4  Tj_  -  tik] 


(7) 


a)  Associate  to  every  P^  e  S  the  parameter  T^, 
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As  an  example  of  Montanari's  first  algorithm,  consider 
the  example  of  Figure  4.1a.  We  first  complement  this  image  to 
yield  the  image  shown  in  Figure  4.4a.  Then  we  "countersign" 
all  the  points  which  are  in  the  set  _I  with  0's  and  leave  the 
remaining  points  blank,  as  shown  in  Figure  4.4b.  Then,  employ¬ 
ing  the  reticular  grid  of  order  n  =  2,  we  have  after  the  first 
iteration  the  representation  shown  in  Figure  4.4c.  The  second 
iteration  produces  the  representation  shown  in  Figure  4.4d. 

We  then  examine  all  the  P^' e  E  for  the  test  specified  in  (7), 
and  assign  to  each  survivor  P^'  its  associated  value  T^.  The 
resulting  MAT  is  shown  in  Figure  4.4e.  Note  that  this  MAT  is 
fuller,  i.e.,  has  more  points  on  the  skeleton  than  does  the  MAT 
in  Figure  4. Id.  Also  different  are  the  distances  or  quench 
function.  This  is  to  be  expected  since  the  local  vicinity, 
i.e.,  mask,  in  the  algorithm  used  to  compute  the  MAT  in  Figure 
4.4e  is  larger  than  that  used  to  compute  the  MAT  of  Figure  4. Id. 
In  order  to  make  a  more  realistic  comparison  between  these  two 
algorithms,  the  first  Montanari  algorithm  was  reapplied  with 
n  =  0.  The  results  are  shown  in  Figure  4.4f. 

We  note  that  this  MAT,  while  more  similar  to  the  MAT  of 
Figure  4. Id  than  the  one  computed  with  n  ■  2,  is  still  diffe¬ 
rent.  Thus,  the  choice  of  algorithm  is  another  factor  which 
must  be  considered  in  the  design  of  a  grammar  for  describing 
the  syntax  of  the  MAT. 

Montanari's  second  algorithm  is  a  Dantzig  (one-pass) 
algorithm.  The  algorithm  proceeds  as  follows: 
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.  Montanari's  First  Algorithm:  a)  Original  Image; 
b)  Countersigned  Image;  c)  After  First  Iteration, 
n«2,  d)  After  Second  Iteration,  n*2;  e)  MAT  for 
n»2;  f)  MAT  for  n*0. 


Figure  4.4 
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(1)  Examine  all  P^'  to  find  tkN  such  that  tkN  =  min  tjN. 

Obviously  TN  =  0,  and  Tk  =  tkN.  Then  define  the  sets 

II  *  tP»l .  ll  ’  2  -  Ei- 

(2)  Compute  the  values 

Th  *  <*h*  +  V  ¥  V  £  £  <8> 

px’  e  ^1 

( 3 )  Compute 

EfU  =  E  J  =  [Pk'J  (9a) 

E2+1  =  Eg  -  EPk']  (9b) 

If  E|+1  is  empty,  stop;  otherwise  repeat  steps  (2)  and  (3). 
For  the  vertices  Pk  accepted  in  the  mth  step,  remember  the  ver¬ 
tex  (or  vertices)  Px'  for  which  (t^x  +  Tx)  is  the  least  in  (8), 
i.e.,  for  which  Tx  *  Th  -  thx,  an(3  countersign  all  the  Px'  that 
are  encountered  in  this  manner.  The  non-countersigned  vertices 
belong  to  the  set  S. 

The  algorithm  is  demonstrated  for  n  =  0  by  again  consider¬ 
ing  the  image  in  Figure  4.1a.  The  intermediate  operations  and 
the  results  are  shown  in  Figure  4.5.  We  note  that  Figure  4.5c 
is  identical  to  Figure  4.4f,  demonstrating  the  equivalence  of 
these  two  algorithms. 

Montanari  also  proposes  a  method  of  eliminating  insignifi¬ 
cant  skeleton  points  by  introducing  a  threshold  function  into 
his  algorithms.  For  the  first  algorithm,  this  involves  changing 
the  criterion  for  inclusion  in  S  as  follows; 
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Figure  4. 
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.  Montanari's  Second  Algorithm  for  n«*0;  a)  Original 
Image;  b)  After  Application  of  Algorithm;  c)  MAT 
Resulting  from  Algorithm. 
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S  =  [Pi  I  Pi'  s  E;  for  every  P j '  ,  Tj  -  Ti  <  (10) 

and  for  the  second  algorithm,  countersigning  the  vertices  for 
which  Tx  <_  -  k-t^,  where  K  is  the  threshold  coefficient, 

0  <.  K  <  1. 

The  major  use  of  this  threshold  function  is  to  eliminate 
skeleton  noise  introduced  when  n  >  0.  Montanari  suggests  as  a 
rule  of  thumb  that  K  <_  0.70,  and  in  his  paper  [24]  he  gives  an 
example  for  three  values  of  the  threshold  showing  0.70  to  yield 
the  best  results. 

Montanari *s  Gray  Weighted  Skeleton 

Montanari  and  Levi  [23]  are  able  to  extend  Montanari' s 
algorithms  for  finding  MAT'S  of  binary  images  to  gray  scale 
images  by  defining  a  new  metric  for  measuring  distances  in  the 
reticular  networks.  They  do  this  by  first  specifying  some  gray 
function  f(x,y)  in  the  real  plane  which  interpolates  the  gray 
values  found  in  the  discrete  image.  Hie  metric  t^j^rs  is  de¬ 
fined  as 

fci j  ,rs  ■  {  *  «  (ID 

where  the  integration  path  lies  between  P(i,j)  and  P(r,s). 

In  the  case  of  binary  images,  the  first  of  Montanari' s 
algorithms  converges  in  two  iterations.  Now,  however,  the  first 
method  cannot  be  assumed  to  converge  in  two  iterations.  The 
algorithm  is  modified  in  the  following  manner.  We  now  assume 
that  the  skeleton  is  internal  to  the  object  instead  of  exter¬ 
nal  to  it  as  before.  Hence,  we  place  zeroes  at  all  the  vertices 
external  to  the  image  and  blanks  at  the  remaining  vertices. 
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Then  each  element  is  considered  sequentially  and  a  new  value 
is  computed  from  the  following  formula: 

bij  *  min  <bij'  brs  +  tij,rs) 

fPrs'l 

where  [ Prs 1 ]  is  the  set  of  vertices  directly  connected  to  P^j 
for  which  new  fc>rs  have  already  been  computed,  as  before.  The 
method  is  applied  with  forward  raster  sequence  alternating 
with  backward  raster  sequence  until  the  image  is  unchanged  in 
both  directions. 

The  computation  of  t^  j  ^rg  can  involve  considerable  work 
depending  upon  the  interpolating  function  chosen.  For  the  step 
function,  the  work  is  easy  since 

bij,rs  =  (1/2) (a^j  +  ars)  .  j  *  rrs  (13) 

where  t;  +  __  is  the  true  length  of  the  arc  connecting  P-s-i'  and 

p  ' 

^rs  * 

The  skeleton  points  are  those  for  which 


b 


ij 


> 


max  (0,  brs 
(prsl 


fcij »rs  > 


(14) 


where  [Prg]  is  the  set  of  all  the  vertices  directly  connected 

tO  Pij. 

Figures  4.6,  4.7,  and  4.8,  taken  from  [23],  show  the 
method  applied  to  a  digitized  image  of  a  human  chromosome. 

Montanari  and  Levi  do  not  explicitly  describe  the  altera¬ 
tions  necessary  to  the  second  algorithm  in  order  to  compute 
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Figure  4.6.  Human  Chromosome  (from  [23]). 
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Figure  4.7.  Black  and  White  Skeleton  of  Human  Chromosome;  n=2; 
K-0.80  (from  [23]. 
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»  •  • 


Figure  4.8.  Gray-Weighted  Skeleton  of  Human  Chromosome;  n=2;  K=0. 
(from  [23] . 
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gray  weighted  Mat's,  but  the  alterations  are  obvious,  since  the 
only  thing  that  is  changed  is  to  compute  the  distances  in 
accordance  with  the  metric  defined  in  (11)  instead  of  that  de¬ 
fined  in  the  explanation  of  (3a)  and  (3b). 

One  thing  that  needs  to  be  made  clear  at  this  point  is 
that  the  b^  are  taken  from  the  previously  computed  image  at 
each  pass  except  the  first,  and  that  the  a^j  are  always  taken 
from  the  original  image.  It  is  easy  to  see  that  if  the  a^j 
and  b—  are  always  taken  from  the  same  image,  then  convergence, 
at  least  in  the  case  of  the  step  interpolating  function  (13), 
cannot  occur  in  a  finite  number  of  steps  (Zeno's  paradox 
revistedl ) . 


A  Grayscale  Thinning  Algorithm 

Dyer  and  Rosenfeld  [36]  have  devised  the  following  algo¬ 
rithm  for  thinning  grayscale  images.  The  algorithm  is  based 
upon  a  grayscale  scheme  in  which  0  is  the  maximum  intensity 
value,  i.e.,  the  larger  the  grayscale  value,  the  darker  the 
pixel  where  a  white  pixel  is  assigned  value  0. 

Using  the  pixel  neighborhood  defined  above,  Dyer  and 
Rosenfeld  define  the  range,  R,  in  the  neighborhood  as 

R  =  max ( B,D,E,F,H)  -  min( B,D, E, P,H )  +  1  (15) 

and  choose  an  appropriate  fraction,  f,  (0£f£l)  such  that  [ f R] 

=  r'  ,  0  <  R<  R.  Then  the  conditions  which  must  be  met  for  re¬ 
placement  of  the  element  under  E  are: 

(1)  at  least  two  of  B,  D,  F,  H  have  values  <.  E  -  R'; 
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(2)  for  each  of  the  six  pairs  of  B,  D,  F,  H,  let  m  be  the 
pair's  minimum;  then  either  E  m  -  R'  or  there  is  a 
path  on  the  perimeter  of  the  mask,  i.e.,  not  involv¬ 
ing  E,  such  that  for  every  point  p  on  the  path, 
p  £  m  -  R' . 

Iff  conditions  (1)  and  (2)  are  met,  then  E  is  replaced  by 
the  minimum  of  itself  and  2  members  of  B,  D,  F,  H.  The  choice 
of  the  2  members  of  B,  D,  F,  H  is  what  determines  the  relation¬ 
ships  of  the  objects  in  the  thinned  image  to  the  objects  in  the 
original  image.  Dyer  and  Rosenfeld  leave  this  choice  to  the 
user.  In  the  case  at  hand,  it  is  desirable  to  skeletonize  the 
objects  in  the  image,  so  the  choice  is  made  as  follows.  Set  a 
flag,  then  scan  the  image  by  sliding  the  mask  over  the  image 
in  normal  raster  scan  beginning  with  element  (2,2)  under  E  and 
ending  with  element  (m-l,n-l)  under  E.  At  each  point  perform 
tests  (1)  and  (2).  On  the  first  pass  through  the  image  and  on 
all  succeeding  odd  passes,  replace  E  with  the  minimum  of  B, 

D,  E.  On  each  even  pass,  replace  E  with  the  minimum  of  E,  F,  H. 
When  E  is  replaced,  the  replacement  is  written  into  a  copy 
image  which  is  created  at  the  beginning  of  each  pass. 

The  tests  (1)  and  (2)  are  made  on  the  original  image  in 
the  first  pass,  and  on  the  result  of  the  previous  pass  on  all 
succeeding  passes.  When  E  is  replaced,  the  flag  is  reset.  The 
procedure  stops  when  a  pass  has  been  made  and  no  E  is  replaced, 
that  is,  the  flag  is  found  set  at  the  end  of  a  pass. 

The  performance  of  this  algorithm  is  dependent  upon  the 
range  of  grayscale  values  in  the  image  and  upon  the  choice  of 


threshold 
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Computation  of  a  Grayscale  MAT  by  Thinning 

To  convert  the  grayscale  thinning  algorithm  of  Dyer  and 
Rosenfeld  to  compute  the  MAT  of  a  grayscale  image,  all  that 
has  to  be  done  is  to  build  a  list  as  was  done  earlier  in  the 
case  of  the  binary  thinning  algorithm.  In  this  case,  there  are 
two  ways  of  defining  the  skeleton  points:  in  the  first,  an 
element  is  a  member  of  the  skeleton  iff  two  of  its  neighbors 
B,  D,  F,  H  have  values  <  E  -  R';  in  the  second,  an  element  is 
a  member  of  the  skeleton  iff  two  of  its  neighbors  B,  D,  F,  H 
are  less  than  E.  The  pass  number,  chain  code  scheme,  etc. , 
are  no  longer  important  because  grayscale  images  cannot  be 
reconstructed  from  their  MAT'S. 

The  results  of  this  algorithm  are  compared  to  the  algo¬ 
rithm  of  Levi  and  Montanari  in  Figure  4.9. 

Conclusions 

The  Medial  Axis  Transform  (MAT)  of  discrete  binary  and 
grayscale  images  can  be  computed  by  two  different  techniques, 
i.e.,  thinning  and  distance  measurement,  both  of  which  simulate 
the  "prairie  fire"  concept.  The  results  of  the  two  techniques 
differ,  and  it  appears  that  the  distance  measurement  techniques 
requires  fewer  operations,  that  is,  it  converges  in  fewer  itera¬ 
tions.  Moreover,  the  skeletons  from  the  distance  measurement 
technique  appear  to  be  more  complete. 

The  thinning  technique  corresponds  to  the  N  =  0  case  for 
the  distance  measurement  technique.  Although  it  is  possible 
to  define  more  complex  thinning  algorithms,  it  does  not  appear 
to  be  a  fruitful  area.  The  capability  of  the  distance  measure- 
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ment  technique  to  be  extended  to  higher  order  is  a  definite 
advantage  in  cases  where  the  fine  structure  of  the  skeleton 
becomes  important.  A  possible  example  of  this  could  be  in 
identifying  a  certain  chromosome  which  closely  resembles  other 
chromosomes,  etc. 

Based  on  these  findings,  it  appears  that  the  distance 
measurement  technique  is  preferable  to  the  thinning  technique 
and  that  future  work  should  concentrate  on  the  application  of 
the  distance  measuring  technique  to  pattern  classification. 


J 
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CHAPTER  V 


LINE  CODING  TECHNIQUES  AND  APPLICATIONS  TO 
AIRPORT  PATTERN  CLASSIFICATION 

Another  aspect  of  digital  image  processing  that  has  re¬ 
ceived  much  attention  in  the  recent  literature  is  the  subject 
of  image  coding.  The  underlying  objective  of  image  coding  has 
been  the  representation  of  an  image  with  as  few  binary  digits 
(bits)  as  possible,  under  the  constraint  that  the  resulting  image 
contains  some  minimum  level  of  fidelity  (also  called  minimizing 
distortion).  This  portion  of  the  report  surveys  the  subject  of 
image  coding  in  some  detail,  in  addition  to  presenting  several 
specific  coding  schemes  and  illustrating  each  with  an  example. 

Before  presenting  the  two  categories  (or  classes)  of  image 
coding  schemes  that  will- be  covered  in  this  report,  it  is  neces¬ 
sary  to  define  the  meaning  of  the  word  "image."  Throughout  this 
coding  discussion,  the  word  "image"  is  to  be  taken  as  being 
synonymous  with  the  words  "digitized  (i.e.,  quantized)  line 
drawing."  A  line  drawing  is  in  turn  defined  as  a  two-dimensional 
representation  of  a  digitized  two-dimensional  image  in  which  only 
the  boundaries  (between  each  region  in  the  image  that  possesses 
a  constant  gray  level)  are  drawn  (or  "traced").  In  other  words, 
a  line  drawing  merely  preserves  the  boundaries  formed  by  the 
different  gray  levels  that  appear  within  a  quantized  image. 
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For  the  purpose  of  the  subsequent  coding  discussion,  then, 
an  image  is  taken  to  mean  a  two-dimensional  line  drawing  that 
has  undergone  a  quantization  process.  The  degree  (or  "fineness") 
of  this  process  is  an  important  parameter  in  any  digital  image 
processing  system,  as  it  influences  both  the  computer  storage 
and  processing  requirements  of  such  a  system.  For  if  an  image 
is  quantized  very  finely,  then  a  large  amount  of  computer 
memory  and  processing  will  be  required;  whereas,  if  an  image 
is  quantized  too  coarsely,  computer  memory  and  processing  will 
be  reduced,  at  the  expense  of  a  loss  in  image  detail.  Thus, 
such  quantization  should  be  as  coarse  as  possible,  while  pre¬ 
serving  those  features  of  the  original  image  that  have  been 
deemed  significant  [44]  .  The  interested  reader  is  referred  to 
references  [40]  and  [44]  for  a  more  detailed  discussion  of  the 
quantization  of  continuous  line  drawings. 

This  portion  of  the  report  presents  a  survey  of  several 
schemes  for  the  encoding  of  quantized  line  drawings.  While  the 
impetus  for  this  discussion  of  image  encoding  was  at  least 
partially  explained  in  the  opening  paragraph  of  this  section, 
the  reasoning  behind  the  subsequent  definition  of  an  image  as  a 
quantized  line  drawing  may  not  as  yet  be  apparent  to  the  reader. 
First  of  all,  given  that  any  modern  image  processing  system  em¬ 
ploys  one  or  more  digital  computers,  then  some  analog-to-digital 
(A/D)  process  is  required  in  order  to  convert  the  original 
analog  (i.e.,  continuous- tone)  image  into  its  equivalent  digital 
(i.e.,  discrete)  representation.  This  A/D  process  is  referred 
to  in  the  literature  as  quantization.  Secondly,  many  image 


processing  systems  seek  to  recognize  and  classify  particular 
objects  (whether  natural  or  man-made)  of  interest  to  the  user. 
This  latter  process  is  referred  to  as  pattern  recognition  in 
the  literature  (e.g,,  references  [11],  [18],  [31].  From  the 
point  of  view  of  pattern  recognition  via  computer,  line  drawings 
have  represented  an  important  category  of  image  representation 
[44] .  It  is  for  these  reasons  that  we  focus  on  the  encoding 
of  quantized  line  drawings. 

Coding  Categories 

The  various  image  coding  schemes  that  will  be  discussed 
in  this  report  can  be  divided  into  two  categories: 

-  Line-by-Line  Encoding 

-  Two-Dimensional  Encoding 

The  first  category,  that  of  line-by-line  encoding,  involves  the 
encoding  of  a  digitized  image  on  a  per-scan-line  basis.  In 
other  words,  any  encoding  scheme  within  this  category  operates 
on  each  scan  line  independently  of  all  others  in  the  image. 

The  following  line-by-line  encoding  schemes  will  be  discussed 
subsequently: 

-  Run-Length  Encoding 

-  Differential  Encoding 

The  second  category,  that  of  two-dimensional  encoding,  contains 
less  restrictive  encoding  schemes  in  which  each  scan  line  is 
not  encoded  independently,  but  rather  as  a  function  of  one  or 
more  of  its  adjacent  scan  lines.  Thus,  this  latter  category. 


unlike  the  first,  inherently  assumes  that  a  given  digitized 
image  cannot  be  considered  to  consist  of  (either  horizontal  or 
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vertical)  scan  lines  that  are  independent  of  each  other.  The 
following  two-dimensional  encoding  schemes  will  be  detailed 
later  in  this  report: 

-  Chain  Encoding 

-  Chain-Difference  Encoding 

-  Line  Adjacency  Graph  Encoding 

-  Region  Adjacency  Graph  Encoding 

A  detailed  discussion  of  and  an  illustrative  example  for  each 
of  the  aforementioned  encoding  schemes  are  presented  in  the 
following  paragraphs. 


One-Dimensional  Methods 

Run- length  Coding 

As  mentioned  previously,  run-length  encoding  falls  under 
the  category  of  line-by-line  encoding.  Run  length  encoding  is 
a  relatively  simple  scheme  in  which  the  amplitudes  of  adjacent 
image  points  along  a  scan  line  are  compared  [30],  The  run 
length  is  the  number  of  juxtaposed  image  elements  along  a  scan 
line  having  the  same  amplitude  (i.e.,  gray  level).  Thus,  for 
a  given  scan  line  whose  image  points  are  denoted  X^,  X2, 
X3,...,XN,  this  scheme  maps  this  sequence  into  a  sequence  of 
integer  pairs  (g^,  l^) ,  where  l k  equals  the  number  of  conse¬ 
cutive  image  points  of  amplitude  g^. 

The  details  regarding  run  length  encoding  are  described 
in  [47]  and  summarized  as  follows.  For  the  first  scan  line  of 
an  image,  the  amplitude  of  its  leftmost  pixel  is  set  equal  to 
g^  and  l ^  is  set  equal  to  the  length  of  the  run  of  points  with 
this  same  amplitude  g^.  Then  at  the  first  amplitude  transition. 
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g2  is  set  equal  to  the  amplitude  of  the  second  run,  with  f2 
equal  to  the  length  of  this  second  run.  This  procedure  is 
repeated  until  the  end  of  the  first  scan  line,  after  which  suc¬ 
ceeding  scan  lines  are  similarly  encoded.  It  is  evident  then 
that,  for  any  given  scan  line,  the  number  of  runs  can  vary  from 
one  (in  which  case  every  pixel  has  the  same  amplitude)  to  N 
(when  no  two  adjacent  points  have  the  same  amplitude).  Corres¬ 
pondingly,  the  run  lengths  can  vary  from  N  to  one.  An 
illustrative  example  of  run  length  encoding  is  presented  in 
Figure  5.1. 

Differential  Coding 

Another  line-by-line  encoding  scheme  mentioned  earlier 
was  that  of  differential  encoding.  The  motivation  for  this 
scheme  is  the  fact  that  since  adjacent  image  points  along  a 
scan  line  are  highly  correlated  in  most  pictures,  there  exists 
a  high  degree  of  redundancy  between  such  pairs  of  points  [30] . 
In  differential  encoding,  only  the  differences  in  amplitudes 
of  adjacent  image  points  are  encoded  along  a  scan  line,  after 
having  encoded  the  actual  amplitude  of  the  first  point  in  that 
line.  Thus,  only  the  relative  amplitude  of  each  successive 
point  after  the  first  point  in  each  scan  is  encoded  in  this 
scheme.  However,  a  little  thought  would  reveal  that  there  is 
no  real  difference  between  run  length  encoding  and  differential 
encoding  in  terms  of  the  encoding  of  quantized  line  drawings, 
as  their  points  can  be  one  of  only  two  values:  black  or  white. 
Thus,  differential  encoding  will  not  be  discussed  further  in 
this  report. 
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a)  Continuous  Line  Drawing 


b)  Method  of  quantizing  line  drawing  shown  in  a) 


Scan  Line  1 - *  «  •  «  » 

Scan  Line  2 - ►  *  * 

Scan  Line  3 - ►  * 

Scan  Line  4 - ►  * 

Scan  Line  5 - ¥  ••••••• 


c)  Quantized  line  drawing  corresponds  to  a) 
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d)  Code  resulting  from  the  Run  Length  Encoding  of  c) 


Figure  5.1.  Continuous  Line  Drawing  and  its  Quantized 
Regenerations 
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Two-Dimensional  Methods 

just  as  it  was  true  for  the  line-by-line  encoding  schemes 
discussed  above,  there  are  similarities  between  the  four  two- 
dimensional  encoding  schemes  that  will  be  covered  in  this  report 
Associated  with  two  of  these  schemes  (namely  chain  encoding  and 
chain-difference  encoding)  is  a  particular  quantization  scheme, 
called  grid- intersect  quantization  [45],  that  will  first  be  des¬ 
cribed  in  the  following  paragraph. 

Imagine  that  a  uniform,  rectangular  grid  is  first  super¬ 
imposed  on  the  continuous  line  drawing  that  is  to  be  encoded. 

A  quantized  version  of  this  line  drawing  can  then  be  derived 
via  a  list  of  the  coordinates  of  those  grid  intersections  which 
come  closest  to  lying  on  the  continuous  line  drawing  [50].  Thus, 
one  representation  of  this  line  drawing  is  simply  the  list  of 
these  "line  drawing  points."  However,  it  is  not  necessary  to 
store  this  entire  list  of  X-Y  coordinates.  Instead,  each  suc¬ 
cessive  point  in  this  list  can  be  encoded  relative  to  its 
previous  point.  The  two  aforementioned  two-dimensional  encod¬ 
ing  schemes  are  two  specific  methods  of  performing  this  encoding 
and  each  scheme  is  described  below. 

Chain  Coding 

Assuming  that  a  given  continuous  line  drawing  has  been  quantized 
via  the  grid-intersect  quantization  method,  then  the  chain  en¬ 
coding  of  the  resulting  quantized  line  drawing  proceeds  as 
follows.  First,  a  point  is  chosen  as  the  "starting  point." 

The  line  segment  from  this  starting  point  to  the  next  point  in 
the  quantized  line  drawing  can  then  be  uniquely  described  (due 
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to  the  grid- intersect  quantization)  by  the  angle  (measured 
clockwise)  that  this  line  segment  forms  with  the  positive  x-axis 
Thus,  the  chain  encoding  scheme  represents  this  segment  (and 
every  successive  line  segment  that  makes  up  the  quantized  line 
drawing)  by  one  of  the  following  eight  single-digit  numbers: 


Chain-Encoded  Angle  Between  Segment 

Representation  _ and  x-axis _ 


0 

1 

2 

3 

4 

5 

6 
7 


0° 

45° 

90° 

135° 

180° 

225° 

270° 

315° 


An  example  that  illustrates  both  the  grid-intersect  quantization 
and  this  chain  encoding  scheme  is  shown  in  Figure  5.2,  and  fur¬ 
ther  discussion  of  this  scheme  can  be  found  in  [43]  ,  [45] ,  and 
[50]  . 

Chain-Difference  Coding 

Chain-difference  encoding  is  a  modified  version  of  chain 
encoding  in  which  the  difference  in  the  angle  between  successive 
line  segments  is  encoded  (i.e.,  the  relative  angle  between  any 
two  adjoining  segments),  rather  than  its  actual  value.  Chain- 
difference  encoding  makes  sense  because,  assuming  that  the 
quantization  on  the  continuous  line  drawing  was  sufficiently 
fine  so  as  to  preserve  all  the  detail  of  interest,  then  it 
follows  that  the  (relative)  difference  in  angle  between  ad¬ 
joining  line  segments  will  ordinarily  not  exceed  ±45°,  whereas 
turns  of  ±90°  will  be  infrequent,  and  those  exceeding  90°  will 
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a)  Continuous  Line  Drawing 


b)  Quantizing  a)  via  Grid-Intersect  Quantization 


Starting  Point-*  < 


c)  Quantized  line  Drawing  corresponding  to  a) 


5'  1  '  7 


d)  Chain  Encoding  Scheme  to  be  utilized  on  c) 


e)  Code  resulting  from  chain  encoding  c)  via  d) 


Figure  5.2.  Grid-Intersect  Quantization  and  Chain  Encoding 
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be  fairly  rare  [45].  Chain-difference  encoding  takes  advantage 
of  this  information  by  representing  the  small  angle  differences 
by  a  short  code,  while  using  successively  longer  codes  to  repre¬ 
sent  the  larger  possible  angle  differences  (analogous  to  Huffman 
Codes,  which  are  detailed  in  [46].  Thus,  it  can  be  shown  that, 
in  general,  chain-difference  encoding  requires  only  slightly 
more  than  two- thirds  of  the  storage  required  by  chain  encoding 
[45]  . 

Both  chain  encoding  and  chain  difference  encoding  are  two 
of  the  two-dimensional  encoding  schemes  in  which  the  input  image 
is  basically  considered  to  be  a  linear  and  ordered  list  of  ele¬ 
ments  [50].  But  another  methodology  for  encoding  continous  line 
drawings  is  to  consider  them  as  a  set  of  interrelated  objects  or 
regions.  The  remaining  two  encoding  schemes,  that  of  line 
adjacency  graph  encoding  and  region  adjacency  graph  encoding, 
implement  this  methodology.  Both  of  these  schemes  are  detailed 
in  [50]  from  which  the  following  discussion  is  taken. 

Line  Adjacency  Graph 

A  line  drawing  consists  of  black  lines  on  a  white  back¬ 
ground  (or  vice-versa).  The  line  adjacency  graph  encoding 
scheme  is  similar  to  the  run-length  encoding  scheme  described 
previously  in  that  both  schemes  successively  scan  each  line 
that  constitutes  the  quantized  line  drawing  in  order  to  record 
where  within  each  scan  line  there  is  a  change  (called  a  break 
point)  from  black  to  white  or  white  to  black.  However,  the 
line  adjacency  graph  encoding  scheme  partitions  each  scan  line 
into  segments  at  these  break  points.  As  the  line  drawing  is 
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scanned,  a  graph  structure  called  a  line  adjacency  graph  is 
constructed  as  follows.  The  "nodes"  of  this  graph  consist  of 
those  segments  (or  individual  points)  that  constitute  a  part 
of  the  line  drawing.  Any  two  nodes  of  this  graph  are  then 
"connected"  if  the  following  conditions  are  satisfied: 

(1)  If  the  segments  (or  points)  represented  by  the  two 
nodes  lie  on  adjacent  scan  lines, 

(2)  If  the  segments  (or  points)  also  overlap  when  pro¬ 
jected  onto  a  single  line. 

An  illustrative  example  of  this  scheme  is  shown  in  Figure  5.3 

It  is  evident  from  the  preceding  discussion  that  the  use¬ 
fulness  of  this  line  adjacency  encoding  scheme  lies  in  the  fact 
that  it  retains  the  two-dimensional  structure  of  the  line  draw¬ 
ing.  Thus,  this  scheme  is  more  powerful  than  any  line-by-line 
encoding  scheme,  which  preserves  left-to-right  relations  but 
not  up-and-down  relations. 

Region  Adjacency  Graph 

A  related  structure  that  is  also  useful  in  segmentation 
of  a  line  drawing  is  called  the  region  adjacency  graph  [50] . 

In  this  type  of  graph,  the  nodes  represent  regions  shown  in  the 
line  drawing,  and  two  nodes  are  connected  if  the  regions  repre¬ 
sented  by  these  nodes  are  adjacent.  An  illustrative  example  of 
this  region  adjacency  graph  encoding  scheme  is  presented  in 
Figure  5.4. 


Comparison  of  Types 

At  this  point  the  reader  should  not  only  be  familiar  with 
several  specific  image  encoding  schemes,  but  should  also  have 


a)  Continuous  line  drawing  of  the  letter  W. 

Scan  Line  1 
Scan  Line  2 
Scan  Line  3 
Scan  Line  4 
Scan  Line  5 
Scan  Line  6 

b)  Quantized  representation  of  a) 


For  Scan  Line  1: 

For  Scan  Line  2: 

For  Scan  Line  3: 

For  Scan  Line  4: 

For  Scan  Line  5: 

For  Scan  Line  6: 

c)  Line  Adajacency  Graph  Encoding  of  b) 

Figure  5.3.  Line* Ad jacency  Encoding 
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a  better  insight  into  why  these  schemes  were  divided  into  two 
fairly  distinct  categories:  line-by-line  encoding  vs  two- 
dimensional  encoding.  So  now  an  interesting  question  arises: 

Does  either  of  these  categories  have  any  inherent  advantages 
over  the  other?  This  question  is  addresssed  both  from  a  prac¬ 
tical  and  a  more  theoretical  aspect  in  the  following  paragraphs. 

From  a  practical  standpoint,  those  encoding  schemes  that 
fall  under  the  category  of  line-by-line  encoding  are  superior 
to  those  that  were  categorized  under  two-dimensional  encoding. 

The  reason  for  this  lies  in  the  fact  that,  almost  invariably, 
one  of  the  first  steps  that  any  digital  image  processing  system 
implements  is  to  convert  an  input  two-dimensional  image  into  a 
series  of  independent  scan  lines  (either  horizontal  or  verti¬ 
cal)  .  Subsequent  encoding  of  each  of  these  independent  scan 
lines  via  a  particular  line-by-line  encoding  scheme  yields  a 
considerable  reduction  in  computer  processing  complexity  and 
storage  requirements  over  those  of  two-dimensional  encoding  [49] . 

Before  comparing  these  two  categories  of  image  encoding 
schemes  from  a  more  theoretical  standpoint,  a  brief  discussion 
of  rate  distortion  theory  is  necessary.  This  theory  centers 
around  a  concept  called  the  rate  distortion  function,  which  will 
be  introduced  via  the  simplified  block  diagram  of  a  communica¬ 
tions  system  shown  in  Figure  5.5.  The  following  development 
is  taken  from  [41]  . 


Rate  Distortion  Theory 


Given  the  source  (e.g.,  an  image)  shown  in  Figure  5.5, 
the  communications  systems  engineer  is  confronted  with  the 


89 


problem  of  encoding  this  source  in  such  a  way  that  the  channel 
capacity  requirement  for  transmission  is  minimized.  (The 
"capacity"  of  the  channel  of  Figure  5.5  will  be  defined  short¬ 
ly).  To  achieve  this,  he  is  willing  to  tolerate  some  average 
distortion  between  the  source  output  and  the  decoder  output. 
The  problem  addressed  by  the  subject  of  rate  distortion  theory 
is  the  minimization  of  this  channel  capacity  requirement  while 
maintaining  this  average  distortion  at  or  below  an  acceptable 
level  (as  defined  by  the  user). 

To  be  more  specific,  let  the  average  information  trans¬ 
mitted  from  the  source  output  (labeled  X  in  Figure  5.5)  to  the 
decoder's  output  (labeled  by  Y  in  this  figure)  be  denoted  by 
a  function  I(X,Y).  Further,  assume  that  more  information 
corresponds  to  larger  values  of  this  function  I(X,Y).  Now  let 
the  analogous  average  information  transmitted  from  the  source 
encoder  output  (labeled  U)  to  the  channel  output  (labeled  V) 
be  denoted  I(U,V).  The  "capacity"  herein  denoted  by  C  of  the 
channel  shown  in  Figure  5.5  can  now  be  defined  as  the  maximum 
of  I ( U , V)  over  all  possible  input  devices.  It  is  then  evident 
via  Figure  5.5  that  the  intervening  nature  of  U  and  V  implies 
that: 

I (X , Y )  <  I(U,V)  <  C 

Establishing  that  it  is  possible  for  I(X,Y)  to  be  arbitrarily 
close  to  C  for  an  arbitrary  source  and  channel  is  not  so  easily 
performed,  and  the  interested  reader  is  referred  to  references 
[39]  ,  [41]  ,  [46]  ,  and  [49]  for  further  details.  Suffice  it  to 
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say  that  the  rate  distortion  function,  usually  denoted  R(D), 
is  the  minimum  value  of  I(x,Y)  for  a  given  distortion  level  D, 
and  only  for  values  of  R(D)  less  than  C  can  the  communications 
engineer  be  insured  of  obtaining  a  distortion  level  D  with  the 
system  of  Figure  5.5.  In  other  words,  this  rate  distortion 
function  R(D)  is  the  minimum  source  encoder  output  rate,  and 
hence  minimum  channel  capacity,  required  for  a  given  average 
distortion  level  D. 

The  impetus  for  this  introduction  of  the  rate  distortion 
function  will  now  become  apparent.  It  should  be  evident  from 
the  above  discussion  that  a  rate  distortion  function  could  be 
derived  for  a  system  of  Figure  5.5  employing  a  line-by-line 
source  encoding  scheme.  Similarly,  another  rate  distortion 
function  could  be  derived  for  an  identical  system  employing  a 
two-dimensional  source  encoding  scheme.  A  comparison  of  the 
resulting  rate  distortion  functions  could  then  be  utilized  to 
determine  which  of  these  two  schemes  has  a  theoretical  advan¬ 
tage  over  the  other. 

Generalized  rate  distortion  functions  for  both  the  line- 
by-line  encoding  and  two-dimensional  encoding  of  a  particular 
two-dimensional  image  (that  is  homogeneous  and  isotropic)  are 
each  derived  in  [49],  However,  the  resulting  pair  of  rate  dis¬ 
tortion  functions  is  found  to  be  too  complex  to  permit  a  direct 
comparison  between  them.  Thus,  [49]  proceeds  to  numerically 
evaluate  these  two  expressions  for  a  particular  example.  The 
results  of  this  evaluation  can  be  summarized  [49]  as  follows: 


( 


( 1)  Two-dimensional  encoding  permits  a  decrease  in  coding 
rate  by  a  factor  of  2  to  3  over  that  of  line-by-line 
encoding,  and 


(2)  In  line-by-line  encoding,  the  selection  of  the  line- 
width  between  consecutive  scan  lines  is  crucial  in 
achieving  an  efficient  coding  rate. 

While  the  authors  of  reference  [49]  achieved  these  results  via 
a  number  of  simplifying  assumptions,  they  feel  that  their  con¬ 
clusions  (stated  above)  are  indicative  of  the  differences  for 
these  two  categories  of  encoding  complexity.  Furthermore,  a 
later  and  independent  evaluation  by  another  author  was  in 
approximate  agreement  with  their  first  conclusion  above  (see 
[41]).  Thus,  from  a  theoretical  standpoint,  it  appears  that 
any  two-dimensional  encoding  scheme  is  superior  to  any  line-by- 
-line  encoding  scheme. 


Line  Drawing  Encoding  -  An  Airport  Example 
Introduction 

The  notion  that  precipitated  the  interest  in  image  coding 
was  the  proposition  that,  given  a  complex  image,  there  exists 
some  series  of  preprocessing  followed  by  suitable  encoding  that 
may  result  in  a  symbol  stream  that  exhibits  a  regonizable  pat¬ 
tern  characteristic  of  an  identifiable  object.  The  system 
proposed  in  the  notion  is  shown  in  Figure  5.6.  A  variety  of 
image  encoding  schemes  has  been  proposed  in  the  literature. 

Some  of  these  are  listed  in  Table  5.1.  Other  techniques  that 
are  often  used  in  the  image  processing  context,  some  or  which 
may  be  considered  an  encoding  techniques,  are  shown  in  Table  5.2. 


Table  5.1 

A  Sampling  of  Available  Image  Encoding  Techniques* 


Error-Free  Encoding  Schemes 

•  Difference  Encoding  with  Shift  Coding 

•  contour  Encoding 

•  Gray  Code 

•  Huffman  Code 

•  Shift  Code 

•  Run- Length  Encoding 

-  One-Dimensional 

-  Two-Dimensional 

Error- Friendly  Encoding 

•  Differential  Pulse  Code  Modulation  (DPCM) 

•  Transform  Encoding 

•  Hybrid  Encoding 

•  Hotelling  Transformation 


i 
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Table  5.2 

Symbolic  Image  Description  Techniques 

•  Connectivity  (Pixel  Connectivity) 

•  Line  Description 

-  Curve  Fitting 

-  Line-to-Point  Transformation 

•  Shape  Description 

-  Metric  Attributes  Area 

Length-to-Width  Ratio 
Perimeter 

Perimeter  to  Area  Ratio 

-  Topological  Attributes  Euler  Number 

Connectivity 

-  Analytical  Attributes  Polar/ Fourier  Curvature 

Function 

Moment  Approximation 

Associated  Tools  and  Techniques: 

•  Shrinking,  Thinning,  and  Skeletonizing 

•  Amplitude  Segmentation 

-  Luminance  Thresholding 

-  Multidimensional  Thresholding  (Color) 

-  Regional  Growing 

•  Edge  Segmentation 

-  Curve  Fitting 

-  Contour  Following 

-  Edge  Point  Linking 

•  Texture  Segmentation 

•  Shape  Segmentation 
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Encoding  as  an  Identification  Tool 
Used  in  its  most  familiar  context,  encoding  is  a  process 
whereby  information  in  one  format  is  converted  into  another  for¬ 
mat  as  a  necessary  step  in  some  transmission  or  storage  process. 
In  this  context  the  objective  of  the  encoding  scheme,  as  noted 
earlier,  is  to  convey  the  original  information  content  with 
minimum  loss  of  information  and  added  extraneous  information, 
while  using  the  minimum  number  of  encoding  symbols.  In  the 
image  identification  problem  the  objective  is  quite  different. 
Here  the  goal  is  to  choose  an  encoding  scheme  that  preserves 
only  enough  information  to  identify  those  key  attributes  that 
uniquely  identify  the  target  or  interest.  "How  do  the  two 
situations  differ?",  one  may  ask.  In  the  one  case  (the  trans¬ 
mission  or  data  storage  example)  the  observer  may  be  required 
to  distinguish  details  or  attributes  that  cannot  be  established 
a  priori.  Such  an  example  may  be  a  specific  human  face  as 
opposed  to  a  general  decision  that  a  human  face  is  indeed  being 
displayed.  In  the  latter  case,  as  in  this  study,  the  objective 
is  to  identify  an  object  that  belongs  to  a  specific  class  of  ob¬ 
jects.  While  the  individual  members  of  the  class  may  exhibit 
characteristics  that  set  themselves  far  apart  from  other  members 
of  the  class,  they  all  share  one  or  more  common  characteristics 
that  we  use  to  combine  them  in  description  as  a  common  entity. 
These  characteristics  are  the  only  characteristics  necessary 
to  encode  if  an  encoding  scheme  is  indeed  to  be  useful  as  a 
tool  in  the  image  identification  process.  In  addition,  target 
information  that  merely  details  the  class  member  or  simply  makes 
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1 

*  it  somewhat  different  from  another  member  of  the  class  is  ex- 

*  - 

traneous  to  the  target  identification  process.  In  the  encoded 
form  the  resultant  encoded  version  of  the  image  may  therefore 
contain  information  extraneous  to  the  target  identification 
process.  This  information  will  serve  only  to  increase  the  pro¬ 
cessing  required  to  complete  the  pattern  recognition  process. 

The  objective,  of  course,  is  to  choose  a  preprocessing  scheme 
and  particularly  an  encoding  algorithm  that  suppresses  the 
noise  (image  detail)  and  retains  or  emphasizes  only  that  infor¬ 
mation  relative  to  the  attribute  or  attributes  that  uniquely 
identify  the  target  as  a  member  of  the  target  class  of  interest. 

Of  course  some  encoding  schemes  may  be  appropriate  to  the 
situation  described  above.  It  is  proposed  here  that  the  validity 
of  the  proposition  is  a  function  of  the  nature  of  the  class  and 
the  level  of  specificity  of  the  definition  of  the  class.  Two 
specific  example  targets  will  be  proposed  in  the  following  dis¬ 
cussion.  In  both  cases  Freeman  chain  encoding  will  be  proposed 
as  an  encoding  scheme.  In  the  first  case,  the  scheme  appears 
to  be  an  appropriate  and  quite  valuable  technique  to  aid  in  a 
straightforward  implementation  of  the  identification  process 
of  Figure  5.6.  In  the  second  case,  quite  the  opposite  is  true 
due  to  the  level  of  variation  of  the  members  of  the  class. 

In  one  proposed  scheme  for  target  identification  perhaps 
one  or  more  of  the  tools  and  techniques  like  those  identified 
in  Table  5.2  would  be  required  to  preprocess  an  image  to  develop 
^  a  candidate  subimage.  The  candidate  subimage  would  perhaps  be 

further  preprocessed  by  applying  an  edge-detection  algorithm. 
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All  of  these  preprocessing  steps  would  be  carried  out  sequen¬ 
tially  in  the  first  process  shown  in  Figure  5.6.  In  the  subse¬ 
quent  encoding  process  shown  in  that  Figure  a  suitable  encoding 
technique  would  be  applied  to  the  preprocessed  image.  In  the 
first  example,  suppose  that  the  raw  image  plane  contains  the 
top  view  of  a  small  isolated  building  as  a  high  illuminance 
rectangle  of  relatively  small  size  contrasted  against  a  dark 
background  representing  a  highly  vegetated  region.  Preproces¬ 
sing  by  luminance  thresholding  followed  by  edge  detection  would 
reveal  an  image  composed  of  a  rectangle  line  drawing.  Encoding 
this  line  drawing  by  either  a  raster  scan  or  Freeman  encoding 
technique  would  result  in  a  relatively  simple  symbolic  descrip¬ 
tion  of  the  rectangular  shape.  Figure  5.7  illustrates  the 
Freeman  chain-encoded  symbol  stream.  Based  directly  on  the 
symbol  stream  both  the  shape  (rectangular)  and  area  can  be 
determined  directly,  as  shown  in  the  final  process  of  Figure 
5.6.  If  these  attributes  are  sufficient  to  imply  an  isolated 
building  structure  (based  perhaps  on  a  priori  knowledge  of  the 
types  of  targets  expected  in  the  image)  then  the  small  rectangu¬ 
lar  area  can  be  positively  identified  immediately  as  an  isolated 
building  structure. 

Figure  5.8  illustrates  a  sub image  obtained  from  one  of  the 
aerial  photographs  made  available  under  the  research  contract. 
Superimposed  over  the  photograph  is  a  clear  rectangular  grid 
used  in  manually  performed  encoding  operations  on  the  image. 
Figure  5.9  is  an  expanded  view  of  the  same  scene,  but  it  repre¬ 
sents  the  result  of  preprocessing  steps  that  isolate  the  large 
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Intensity 


b)  Result  of  Luminance  Thresholding 


Line  j 
Drawing"'* 

Only  - 

c)  Result  of  Edge  Detection  Algorithm 


d)  Line  Drawing  Shown  with  Encoding  Sample  Points  Artifically 
Marked 


111133555577 


e)  Chain  Encoded  Representation  (Starting  Point  =  A  Shown  Above) 


Figure  5.7.  Chain  Encoded  Image  Stream 


b)  Rectangular  Overlay  Representing  Hypothetical  Pixel  Array 
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c)  Aerial  Photograph  With  Hypothetical  Pixel  Array 


Figure  5.8.  A  Subimage  Ready  for  Encoding  (Airport) 
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complex  high  intensity  structure  represented  by  the  runway, 
service  road,  ramp  pattern  followed  by  an  edge  detection  algo¬ 
rithm  that  limits  the  product  image  to  a  line-drawing  output. 

Here  the  high  intensity  pixels  representing  the  outline  of  the 
airport  geometry  would  be  displayed  on  a  dark  background.  Shown 
also  in  the  Figure  are  "tick-marks"  along  the  line  drawing  that 
represent  sampling  points  where  encoding  will  be  performed. 

These  "tick-marks"  were  placed  entirely  randomly  here,  but  in 
a  realistic  situation  the  sample  points  would  be  adjacent  pixels 
in  a  rectangular  array.  As  a  result,  there  would  immediately 
be  introduced  a  measure  of  irregularity  in  an  otherwise  straight 
line  due  to  the  finite  pixel  size  in  the  image.  Therefore,  the 
sample  points  shown  are  only  for  demonstration  purposes. 

The  first  encoding  scheme  applied  to  the  line  drawing 
sample  points  is  a  raster  scan  code.  In  this  encoding  process 
the  X  and  Y  coordinates  of  high  intensity  pixels  are  merely 
read  sequentially  into  a  vector.  The  vector  may  be  either  used 
to  reconstruct  the  line  drawing  at  a  receiving  site,  or  in  this 
case  it  may  be  used  in  a  pattern-analysis  to  determine  its  shape 
or  some  other  key  attribute.  Table  5.3  illustrates  the  raster 
scan  result.  Here  only  every  fourth  point  was  encoded. 

In  a  second  effort  the  same  sample  airport  line  drawing 
geometry  was  chain  encoded  using  the  Freeman  encoding  scheme 
with  eight  levels  of  angle  encoding.  Shown  in  Table  5.4  is  a 
partial  symbol  output  vector.  The  vector  may  be  read  by  simply 
linking  subsequent  columns  in  the  table.  Several  symbols  appear¬ 
ing  in  the  table  should  be  defined  for  the  reader.  First,  the 
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Table  5.4 

Chain  Encoding  of  Periphery  of  Airport  Line  Drawing 
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implementation  of  this  algorithm  assumes  that  a  contour  follow¬ 
ing  algorithm  is  employed.  Code  symbols  may  therefore  include 
stop  and  start  points.  "A"  is  a  starting  point  and  "AE"  repre¬ 
sents  an  end  point,  in  addition,  a  run  symbol  "M"  was  included 
to  reduce  the  length  of  the  vector  when  multiples  of  a  symbol 
are  deduced  by  the  encoder.  The  small  letters  occurring  in  the 
table  refer  to  benchmark  points  that  are  shown  in  the  original 
line  drawing.  These  are  included  only  for  reference  purposes. 

It  is  interesting  to  note  that  the  chain-encoded  portions  of 
the  vector  representing  the  long  straight  sides  of  runway  sur¬ 
faces  are  interspersed  in  other  segments  that  represent  signi¬ 
ficant  detail.  If  this  attribute — "long  ribbon,  etc."  is  a  key 
attribute,  then  a  significant  amount  of  processing  of  the  en¬ 
coded  symbol  string  will  likely  be  necessary  to  extract  the 
desired  attribute.  Although  this  encoding  scheme  does  not  pre¬ 
clude  further  use  in  the  context  of  identification,  the  struc¬ 
ture  is  not  such  that  the  encoding  emphasizes  the  key  attribute 
identified  above.  If,  on  the  other  hand,  the  key  attribute  of 
interest  is  total  high  intensity  (paved)  area,  then  the  chain 
encoding  scheme  may  be  considered  appropriate,  since  area 
measure  is  very  straightforward  in  the  the  chain  encoded  format. 

Table  5.5  illustrates  an  output  that  would  be  obtained  if 
Hough  transformation  were  applied  to  the  line  drawing,  followed 
by  a  threshold  on  the  counts  obtained  in  the  quantization  bins. 
Assumed  here  are  fairly  small  levels  of  quantization.  Shown  in 
the  Table  are  three  numerical  entries  in  each  column.  The 
column  is  headed  by  a  letter  that  identifies  the  straight  line 
in  the  original  figure  that  resulted  in  the  Table  entry.  The 
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upper  figure  is  the  bin  count,  the  number  under  the  symbol  is 
the  angular  bin  number  (angle  of  line)  and  the  number  under  the 
symbol  is  the  value  of  rho  in  the  BIN  (perpendicular  distance 
from  origin) .  The  Table  not  only  indicates  the  relatively  long 
lines  (high  bin  count),  but  it  also  indicates  those  of  relatively 
equal  length  (possibly  representing  long  ribbonlike  structures — 
roads  or  runways).  It  also  offers  information,  as  noted  in 
Chapter  II,  that  will  allow  a  computation  of  distance  separating 
the  lines  (road  width  or  runway  width).  For  example,  by  merely 
consulting  Table  5.5  the  reader  can  immediately  determine  that 
lines  A,  B,  C,  and  D  are  long  and  parallel.  Furthermore,  by 
observing  the  value  of  rho  he  can  observe  that  lines  A  and  C 
are  separated  widely  (indicating  a  wide  expanse — perhaps  a  run¬ 
way)  and  that  lines  B  and  D  are  narrowly  separated  (indicating 
perhaps  a  narrow  road  or  taxiway) . 

In  the  latter  case  the  Hough  routine  is  immediately  recog- 
gnized  as  a  valuable  encoding  tool  because  it  possesses  the 
characteristic  that  it  serves  to  emphasize  a  key  property  that 
is  of  importance  in  identifying  the  target  of  a  particular  class. 
Chain  encoding,  while  of  some  value,  would  require  additional 
processing  subsequent  to  the  encoding  to  extract  the  key  attri¬ 
bute,  and  raster  scan  encoding  would  likely  be  of  even  lesser 
value. 


The  Airport  Recognition  Problem  -  Cues  and  Attributes 

Introduction 

An  airport  is  a  striking  target,  easily  identified  by  the 
human  observer  in  the  type  of  aerial  photograph  available  to 
this  study.  Because  of  this  observation,  it  seemed  that  this 


type  of  target  structure  apparently  offered  a  rich  set  of  cues 
or  attributes  with  which  to  identify  the  nature  of  the  target. 

The  objective  of  this  effort  was  to  identify  and  document  those 
attributes  and  to  formulate  a  methodology,  algorithms,  and  soft¬ 
ware  to  perform  the  identification  process. 

Before  further  researching  the  problem,  several  notions 
regarding  the  target  geometry  were  intially  considered  as  pos¬ 
sible  tools  for  use  in  the  identification  process.  In  some 
respects  these  tools  or  attributes  appear  to  be  valid  across 
the  board  as  several  diverse  airport  structures  are  considered. 
Given  the  limited  data  base  available,  a  more  complete  set  of 
aerial  data  was  obtained  and  studied  to  identify  more  accurate 
target  attributes. 

Much  of  the  thrust  of  preceding  efforts  has  used  geometry 
as  a  primary  means  of  sorting  or  identification.  This  thrust  has 
influenced  the  current  efforts  in  that  geometry  was  considered 
again  as  a  prime  candidate  in  investigating  the  means  necessary 
to  perform  the  airport  identification  process.  It  appears, 
however,  that  other  cues  may  also  become  extremely  valuable  in 
the  identification  process.  A  decision  tree  application  to 
the  airport  decision  problem  will  be  identified  and  described 
in  a  subsequent  discussion.  One  important  attribute  identified 
in  the  decision  tree  is  the  extremely  large  area,  smooth,  uni¬ 
form  intensity  runway  surface.  Although  not  a  positive  means 
of  identification,  these  attributes  can  increase  the  probability 
of  a  correct  decision  when  coupled  with  other  features. 
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Airport  Characteristics  -  A  Description 

One  method  for  identifying  objects  or  "targets",  as  they 
will  be  called  in  this  discussion,  is  to  simply  create  a  mask 
or  template  using  a  known  geometry  or  some  other  known  attri¬ 
bute.  Such  a  technique  is  extremely  applicable  in  the  case  of 
an  optical  character  reader  equipped  to  read  characters  from  a 
fixed  character  font.  In  this  case  the  target  set  is  a  fixed 
controlled  set  known  a  priori  to  the  character  reader.  The 
primary  problem  for  the  character  reader  is  then  that  it  must 
distinguish  one  character  from  another.  Generally,  all  of  the 
targets  confronting  the  character  reader  will  come  from  the 
fixed  character  set,  a  well  known,  well  characterized  number 
of  target  objects. 

In  the  current  study  the  situation  is  somewhat  different. 
Here  the  "reader"  may  confront  any  one  or  more  of  a  large,  al¬ 
most  unbounded,  number  of  targets.  Some  of  these  targets  may 
be  from  the  list  of  "important"  targets  identified  a  priori  to 
the  reader,  but  other  items  may  be  unknown.  Many  target  candi¬ 
dates  may  be  large  in  scale,  as  in  the  case  of  the  airport, 
and  other  candidate  targets  may  be  very  small,  as  in  the  case 
of  isolated  building  structures.  Therefore,  even  the  field  of 
view  and  orientation  with  respect  to  the  target  in  the  airport 
identification  will  be  unknown  a  priori. 

The  airport  identification  problem — indeed,  the  problem 
of  identifying  most  of  the  structures  in  the  list  of  important 
targets — is  further  complicated  in  that  the  targets  are  generic 
or  functional  structures  rather  than  specific  descriptions  that 
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possess  specific  characteristics.  A  simple  target  like  a  single 
isolated  building,  for  example,  may  vary  in  both  size  and  shape. 
It  may  even  have  a  black  roof  or  a  highly  reflective  roof,  yet 
it  is  said  to  belong  to  a  single  class  called  "detached  buildings 
The  same  is  true  of  the  airport  structure.  Some  major  airports 
will  include  several  landing  strips,  parallel  taxiways,  and 
immense  aprons,  while  others  will  be  more  compact.  One  propo¬ 
sition  put  forth  in  early  discussions  is  that  a  basic  structure 
is  that  depicted  in  Figure  5.10.  One  candidate  in  the  Figure 
shows  three  large  landing  strips,  roughly  in  a  triangular  pat¬ 
tern,  crossing  one  another  at  random  acute  angles.  The  second 
shows  a  simple  crossed-V  shape.  Such  a  geometry,  if  applicable, 
could  be  encoded  and  identified  based  on  code  symbol  character¬ 
istics.  The  identification  scheme,  it  was  postulated,  could 
be  sufficiently  general  so  as  to  accept  variations  in  specific 
runway  lengths  and  intersection  angles. 

Before  continuing  along  these  lines  it  was  imperative 
that  actual  target  characteristics  be  investigated  so  as  to 
ascertain  their  specific  nature.  Such  a  procedure  is  necessary 
in  the  case  of  generic  or  functional  targets  because  they  take 
so  many  forms.  The  extraction  of  features  (or  attributes)  thus 
becomes  a  necessity;  template-matching  is  ruled  out.  Expressed 
in  words,  such  a  feature  may  be,  for  example,  "long,  wide, 
straight  ribbons  of  concrete."  This  attribute  must,  of  course, 
be  capable  of  representation  in  machine  form,  and  the  words 
"long",  and  "wide"  must  be  reduced  to  some  numerical  range 


that  can  be  tested 
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Before  examining  specific  examples  and  identifying  speci¬ 
fic  attributes,  we  should  examine  some  ideas  that  were  proposed 
in  the  context  of  the  complete  cartographic  target  set.  This 
is  important  since  a  major  objective  of  this  program  is  to 
classify  targets.  In  the  discussion  regarding  the  categorization 
of  the  targets  into  classes  as  a  means  of  developing  attributes, 
the  airport  structure  was  identified  as  a  "complex"  structure  in 
that  it  incorporates  road  and  highway-like  structures  and  that 
it  also  incorporates  buildings  and  parking  facilities.  Often 
major  airports  are  located  in  relatively  rural  areas,  although 
some  examples  can  be  cited  where  large  cities  include  landing 
fields  within  their  boundaries  just  adjacent  to  large  urban 
concentrations.  One  method  that  could  be  used  to  identify  an 
airport  would  be  to  identify  its  fundamental  components  (i.e., 
runways,  taxiways,  hangars,  parked  aircraft,  parking  lots, 
administrative  and  support  buildings),  then  to  make  a  decision 
based  on  whether  or  not  the  image  contains  a  sufficient  set  of 
the  above  attributes  as  identified  in  a  previous  set  of  decision 
processes.  Such  a  process  is  shown  pictorially  in  Figure  5.11. 

It  is  fundamentally  a  process  based  on  concepts  of  syntactic 
pattern  recognition.  For  each  of  the  fundamental  structures 
runway,  hangar,  etc.)  an  identification  procedure  would  be 
applied  to  the  image.  A  syntax  of  geometrical  constructs 
(angles,  long  straight  sides,  etc.)  would  be  appropriate  to 
identify  some  of  the  candidate  components  of  the  airport  struc¬ 
ture.  Other  structures  would  be  identifiable  based  on  intensity 
or  texture  measures.  Then  at  the  next  higher  level  of  decision- 
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making,  component  descriptions  would  be  used  as  inputs  to  the 
final  decision  regarding  the  complex  structure.  Such  a  bottoro- 
up  approach  could  also  be  used  to  identify  urban  areas,  start¬ 
ing  first  by  identifying  component  structures  in  an  image,  such 
as  buildings  and  streets.  This  differs  from  the  top-down  approach 
where,  for  example,  a  texture  measure  may  be  used  to  determine 
if  an  image  depicts  an  urban  or  a  rural  area  simply  by  looking  at 
the  general  textural  measures  of  a  large  image  and  then  directly 
concluding  from  it  the  nature  of  a  complex  target  like  an  urban 
area. 

The  technique  adopted  as  a  result  of  this  study  may  be 
regarded  as  a  variant  of  the  bottom-up  approach  in  that  several 
characteristic  attributes  of  the  airport  structure  have  been 
identified,  and  these  are  used  as  intermediate  decision  points 
in  the  overall  decision  process.  Other  components  that  have  not 
been  currently  considered  in  the  logic  tree  structure  for  the 
airport  problem  have  been  what  are  considered  secondary  and 
tertiary  cues  like  buildings  and  parking  lots.  Table  5.6  in¬ 
cludes  several  cues  that  have  been  identified  as  characteristic 
of  most  airport  structures.  These  have  been  developed  based 
on  observation  of  sample  image  characteristics  and  direct 
observation  of  airport  geometries  and  characteristics  from 
the  air. 

Figures  5.12  through  5.21  are  line  drawings  of  ten  random¬ 
ly  selected  airports  from  an  official  document  published  by  the 
Air  Traffic  Service  of  the  Federal  Aviation  Administration. 

The  document  provides  major  airport  visual  outlines  and  brief 


Table  5.6 

Characteristics  of  Airport  Structures 

One  or  more  (1-5)  runways 

Runways  are  long  but  of  limited  (abrupt  end)  length 

Runways  are  extremely  straight  and  uniform 

Most  airports  include  secondary  roadlike  structures  as 
aprons  and  taxiways 

Runway  width  often  exceeds  that  of  standard  roads 

Airports  are  on  relatively  flat  land 

Airports  are  accessible  by  roads 

Runways  are  concrete — white  and  highly  reflective 

Most  airports  include  buildings  for  airport  administration 
and  hangars 

Most  major  airports  include  large  parking  facilities 

Often  the  region  enclosed  by  the  runway  and  taxiway  surfaces 
is  of  uniform  texture,  charactreristic  of  low  cut  vegetation 
to  facilitate  ground  visbility 
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Figure  5.15.  San  Diego  International 


119 

AIP-U.S.  AERODROME  CHART  •  COLD  BAY  AGA  2-10-3 


Figure  5.16.  Cold  Bay 


Figure  5.17.  Tampa  International 


JNITEO  STATES  AERODROME  CHART- CLEVELAND  HOPKINS  INTERNATIONAL 


Figure  5.19.  Cleveland-Hopkins  International 
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SIATES  AERODROME  CHART  JOHN  F  KENNEDY  INTERNATIONAL  AliA 


Figure  5.21.  Kennedy  International 
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descriptions  for  all  major  airports,  both  U.S.  and  international. 
The  data  in  this  document  provide  a  convenient  single  source  for 
characteristics  like  typical  runway  width  and  length,  number 
of  runways,  apron  geometry,  and  even  runway  slope.  Although 
the  ten  presented  here  were  selected  randomly  in  terms  of 
specific  location,  the  complexity  of  geometry  was  reviewed  and 
used  as  a  selection  device  in  order  to  obtain  a  wide  range  of 
shapes.  The  wide  range  of  geometries  displayed  in  the  volume 
is  perhaps  one  of  its  most  interesting  characteristics.  The 
majority  of  the  line  drawings  displayed  in  the  figures  are  for 
airports  located  in  the  continental  United  States.  These  are  to 
a  large  extent  indicative  of  the  images  that  may  be  constructed 
as  the  result  of  the  application  of  an  edge  detection  algorithm 
to  the  topographic  images  proposed  for  processing  in  the 


current  study. 

The  specific  airport  locations  depicted  in  the  Figure  are 
as  follows: 


Figure  No 


Location/Designation 


5.12 

5.13 

5.14 

5.15 

5.16 

5.17 

5.18 

5.19 

5.20 

5.21 


Yap 

Pago  Pago  international 
Eielson  AFB 

San  Diego  International 
Cold  Bay 

Tampa  International 
Baltimore-Washington  International 
Cleveland-Hopkins  International 
Dallas-Ft.  Worth  Regional 
Kennedy  International 


The  figures  are  arranged  in  a  generally  increasing  order 
of  geometrical  complexity,  starting  with  Yap  location,  which 
features  only  one  long  strip  of  moderate  length  (4,832  feet). 
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and  ends  with  perhaps  one  of  the  most  complex  airport  geometries 
in  the  world,  John  F.  Kennedy  Airport  in  New  York.  The  latter 
features  five  operational  runways  ranging  from  2,560  to  14,572 
feet  in  length.  The  array  of  taxiways,  aprons,  and  support 
roadways  is  extremely  complex.  These  elementary  components  are 
both  long  and  straight  and  curved  in  nature. 

These  figures  illustrate  the  wide  range  of  sizes,  geo¬ 
metries,  and  complements  of  facilities  that  can  be  aggregated 
to  form  the  functional  unit  that  can  be  ascribed  to  the  class 
generally  called  "airports."  From  these  figures,  however, 
several  very  valuable  observations  can  be  made  regarding  general 
attributes.  In  general,  these  attributes  when  used  in  conjunc¬ 
tion  with  one  another,  form  a  set  unique  to  the  generic  airport 
geometry  and  thus  provide  characteristics  that  can  be  used  to 
uniquely  identify  an  unknown  image  if  that  image  is  indeed 
that  of  an  airport.  The  primary  objective  here  is  to  identify 
a  small  set  of  unique  attributes  that  occur  often  in  an  airport 
geometry  and  concurrently  do  not  occur  often  in  images. 

Seme  of  the  major  characteristics  of  the  ten  selected 
facilities  are  shown  in  Table  5.7.  At  the  top  of  the  table  is 
the  simplest  facility,  which  is  merely  a  simple  single  rectang¬ 
ular  strip  100  feet  in  width,  4,832  feet  in  length,  with  straight 
ends,  no  taxiways,  and  a  simple  terminal  facility  located  at 
approximately  the  60/40  point  along  the  strip.  The  facilities 
become  increasingly  complex  as  the  entries  become  lower  in  the 
chart.  The  most  complex  entry,  Kennedy  International,  New  York, 
contrasts  greatly  with  the  first  entry.  As  an  indication  of 
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the  degree  of  slope  in  these  typical  airport  structures,  the 
most  level  facility  exhibits  zero  longitudinal  slope  over  its 
two  mile  runway  length  and  the  worst  exhibits  a  0.5%  longitudi¬ 
nal  slope  (45  feet  over  its  9,000  foot  length).  Since  the  larg¬ 
est  facility,  Kennedy  International,  exhibits  the  most  complex 
structure,  it  serves  to  indicate  the  characteristics  that  can 
be  used  for  identification  purposes  and  includes  many  secondary 
characteristics  that  must  be  overlooked  if  the  identification 
alogrithm  is  to  be  successful.  (More  specifically,  the  algo¬ 
rithm  must  suppress  those  characteristics  that  do  not  play  a 
part  in  providing  decisions  as  the  decision  process  is  carried 
out. ) 

These  latter  characteristics  may  be  identified  for  the 
purposes  intended  here  as  extraneous  to  the  identification  pro¬ 
cess.  If,  for  example,  a  key  characteristic  that  cuts  across 
all  of  the  facilities  is  identified  as  "runway(s)  with  an 
extremely  straight  character  and  of  limited  length",  then 
other  characteristics  of  the  complex  target  may  be  regarded  as 
noise.  Connecting  taxiways  and  service  roads  or  aprons  that 
may  connect  to  the  main  runways  are  examples  of  these  extraneous 
attributes  when  viewed  in  the  context  of  the  single  attribute 
identified  above.  This  discussion  and  the  more  complex  facility 
descriptions  (Figures  5.19,  5.20,  5.21)  illustrate  the  necessity 
for  processing  or  encoding  of  the  image  or  one  of  its  transformed 
products  with  a  technique  that  is  designed  to  select  effective 
features.  The  generic  nature  of  the  airport  configuration,  and 
the  need  to  develop  techniques  that  emphasize  key  attributes 
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heavily  affects  the  selection  of  encoding  schemes  that  may  be 
considered  for  use  in  conjunction  with  the  target  recognition 
process.  This  area  of  interest  is  addressed  in  greater  detail 
elsewhere  in  this  report.  Several  encoding  techniques,  to  in¬ 
clude  raster  scan  encoding  of  the  raw  image  and  Freeman  encoding 
of  the  line  drawing  produced  by  an  edge  detection  algorithm, 
were  considered  during  the  course  of  the  period  by  looking  at 
an  empirical  characterization  of  candidate  images.  Hie  conclu¬ 
sion  of  this  brief  effort  was  that  in  the  very  complex  facility- 
related  images  these  techniques  tend  not  to  deemphasize  the 
detail  associated  with  the  image  complexity.  The  result  is  a 
very  complex  encoded  characterization  that  includes  the  key 
attribute(s)  of  interest  as  well  as  a  large  amount  of  extraneous 
information  mixed  into  the  key  data  stream.  The  type  of  charac¬ 
terization  forces  the  subsequent  analysis  steps  to  carry  the 
burden  of  sorting  out  the  key  attribute  data. 

It  is  particularly  important  to  emphasize  the  points  elucidated 
in  the  paragraph  above,  since  on  the  one  hand  an  objective  of 
the  study  was  to  investigate  the  opportunities  that  could  be 
obtained  from  image  encoding  in  the  image  identification  con¬ 
text.  On  the  other  hand  the  major  objective  was  to  investigate 
identification  schemes  appropriate  to  the  targets  of  interest. 

A  general  conclusion  here  is  that  in  the  case  where  the  target 
is  a  complex  generic  target,  the  types  of  encoding  schemes 
that  were  investigated  in  this  study  are  of  limited  value.  If, 
on  the  other  hand,  the  definition  of  encoding  is  extended  to 
include  processes  like  the  Hough  transform,  then  encoding  can 
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be  considered  a  useful  technique  in  the  overall  target  identif¬ 
ication  process.  As  a  matter  of  fact,  the  Hough  transform 
technique  is  particularly  useful  in  the  airport  identification 
process  because  its  characteristics  allow  the  emphasis  of  a  key 
attribute  in  the  airport  problem — the  long,  straight,  abruptly 
terminating  runway  and  taxiway  surfaces,  while  deemphasizing 
edges  that  do  not  contribute. 

Let  us  consider  the  key  attributes  that  have  been  identi¬ 
fied  as  indicative  of  an  airport  facility  regardless  of  the 
complexity  and  level  of  detail  associated  with  the  overall 
structure.  These  are  identified  as  follows: 


•  One  or  more  Runway  Surfaces 


•  Two  or  more  pairs  of  Edges 


•  Edges  lie  along  extremely  straight  lines 

•  Line  length  4,000  -  15,000  feet  long 

•  Edge  pairs  75-200  feet  apart  (150  feet 

appears  most  likely) 

•  Other  edge  pairs  parallel  to  the  first 

May  be  of  lesser  or  equal  width 
(Taxiways  vs  parallel  runways) 


•  Most  major  airport  facilities  include 
paved  areas  of  500,000  square  feet  as  a 
minimum;  large  scale  airports  contain 
5,000,000  square  feet  of  paved  areas  as 
a  minimum 
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These  very  simple  characteristics  are  present  in  all  of 
the  facilities  identified  here  and  certainly  do  not  include 
many  of  the  available  cues  identified  earlier.  They  are, 
however,  implemented  easily  and  would  appear  to  be  effective 
features  due  to  their  unique  character  with  respect  to  the 
other  types  of  candidate  targets  expected  in  the  images. 

Implementation 

The  requirements  for  analysis  and  an  algorithm  resulting 
from  the  characteristics  identified  in  the  preceding  paragraphs 
are  very  similar  to  those  identified  in  the  earlier  bridge 
related  analysis.  Of  particular  importance  are  the  require¬ 
ments  to  identify  long  (4,000  -  15,000  feet)  parallel  lines  in 
the  images  and  to  determine  their  separation  distances.  The 
Hough  transform  and  the  algorithms  developed  in  support  of  the 
bridge  identification  effort  will  require  minor  alteration  to 
allow  this  analysis  using  tools  that  are  available.  It  is 
necessary  that  the  edges  representing  the  runways  be  exposed 
to  a  very  fine  degree  of  angle  and  intercept  quantization  to 
assure  that  the  edge  that  is  under  investigation  is  indeed  a 
straight  edge  as  is  characteristic  of  the  airport  runway  sur¬ 
face.  End  points  in  the  space  domain  must  be  recorded  during 
the  analysis  to  assure  that  the  long  parallel  edges  do  span 
distances  in  the  4,000  to  15,000-foot  rajige,  and  that  the  ex¬ 
tent  of  the  span(s)  is  indeed  finite  and  ends  abruptly.  Hie 
inverse  Hough  routine  described  above  may  prove  useful  in 
carrying  out  the  analysis  related  to  the  determination  of 
abruptly  terminating  spans. 
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Analysis  with  respect  to  the  extremely  large  paved  sur¬ 
face  area  will  be  facilitated  by  the  fact  that  most  airport 
surfaces  are  constructed  of  concrete  to  enable  them  to  support 
aircraft.  These  areas  will  be  large  (greater  than  500,000  ft2 
and  extending  up  through  10,000,000  ft2)  and  will  include 
the  regions  identified  in  the  Hough  portion  of  the  analysis  as 
potential  runway  surfaces.  The  extent  of  the  large  area  target 
(bounding  rectangle)  in  terms  of  coordinates  will  be  roughly 
15,000  by  15,000  feet.  The  analysis  here  will  require  thres¬ 
holding  of  the  raw  image  followed  by  a  routine  that  will 
examine  the  sizes  of  the  regions  of  uniformly  high  intensity. 


CHAPTER  VI 

CONCLUSIONS  AND  RECOMMENDATIONS 
FOR  FURTHER  WORK 

Conclusions 

Progress  has  been  made  during  the  past  year  in  the  devel¬ 
opment  of  computer  techniques  to  extract  items  of  cartographic 
interest  from  aerial  photographs.  The  problem  as  originally 
posed  was  the  construction  of  a  system  that  can  perform  auto¬ 
matic  or  semi-automated  cartographic  analysis;  a  number  of 
subsidiary  problems  were  identified  that  provided  the  pattern- 
recognition  framework  for  the  overall  problem.  To  date  we  have 
examined  two  types  of  man-made  cartographic  objects — bridges 
and  runway  configurations — in  a  variety  of  surroundings.  Re¬ 
sults  were  good,  indicating  that  it  is  possible  to  locate 
accurately  those  objects  in  the  image,  using  a  combination  of 
edge-detection  and  transform  techniques  and  certain  a  priori 
information  about  the  two  types  of  objects. 

The  success  to  date  provides  support  for  the  use  and  en¬ 
hancement  of  the  Hough  transform  as  a  locator  of  lines  and  line 
segments;  since  lines  often  characterize  man-made  cartographic 
objects,  there  is  value  in  further  development  of  that  tool. 

The  use  of  a  large  number  of  picture  segments,  coupled  with  a 
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variety  of  object/area  types  is  necessary  for  completion  evalua-  . 
tion  of  that  approach  to  feature  extraction. 

The  medial  axis  transform  has  been  refined  for  use  in  the 
discrete  case  and  is  expected  to  be  valuable  in  representing 
largearea  types  of  objects  (as  opposed  to  bridge  and  runway 
targets) . 

The  success  of  various  features  used  in  recognition  of 
cartographic  objects  depends  in  part  on  the  spatial  resolution 
and  intensity  quantization  of  the  digitized  image.  Although 
the  present  values  of  resolution  and  quantization  are  adequate, 
it  may  be  that  less  of  either  or  both  would  also  be  adequate; 
in  cases  where  memory  is  limited  one  many  have  to  make  a  trade 
between  number  of  pixels  and  number  of  gray  levels  per  pixel. 

The  effect  of  such  a  trade  on  the  effectivness  of  various 
features  is  unknown. 

Interesting  and  Important  Problems 
It  quickly  becomes  clear  that  pictures  can  be  expressed 
as  syntactic  structures,  composed  of  multiple  occurrences  of 
members  of  a  small  set  of  primitive  shapes.  Cartographic 
images  can  be  expected  to  have  a  set  of  elementary  components 
that  can  be  combined  into  strings  the  parsing  of  which  will 
yield  the  classes  of  the  objects  of  interst.  The  problem  is 
to  identify  such  a  set  and  demonstrate  its  practicality  and 
accuracy. 

The  degree  of  invariance  of  features  as  the  tradeoff 
(for  fixed  memory  size)  between  number  of  pixels  and  number  of 
levels  occurs  can  be  measured  by  a  sensitivity  analysis.  It 
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will  be  of  great  theoretical  and  practical  value  to  know  where 
resources  (both  memory  and  computing)  should  be  devoted.  Fur¬ 
ther,  depending  on  the  cartographic  item  of  interest  the 
allocation  of  resources  may  be  changed  during  processing.  Part 
of  this  problem  is  affected  by  the  goal  of  removing  redundancy; 
as  noted  in  this  report,  a  homogeneous  area  of  large  size  need 
not  have  all  of  its  component  pixels  identified  individually, 
which  would  allow  a  compact  representation  to  guide  the  feature 
extraction. 

Classifier  design  (except  for  the  case  of  a  syntactic 
approach)  is  a  worthwhile  area  for  study  because  it  uses  what¬ 
ever  information  is  available  from  the  features  to  partition  the 
space  to  minimize  the  probability  or  the  expected  cost  of  mis- 
classif ication.  One  approach  uses  features  sequentially,  thus 
computing  them  only  as  needed  to  reach  a  preset  level  of  error 
probability.  Methods  exist  that  are  optimal  in  the  sense  that 
the  number  of  measurements  required  is  the  minimum  necessary 
to  achieve  the  specified  error. 

It  is  essential  to  extend  the  work  already  done  to  new 
types  of  objects,  to  larger  numbers  of  them  (to  obtain  good 
estimates  of  classifier  performance),  to  a  variety  of  back¬ 
grounds  and  qualities  of  photographs,  and  to  imagery  of  other 
kinds  (radar,  infrared,  etc.). 
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Appendix  3.1.  Downsampling  Algorithm 
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Appendix  3.2.  Algorithm  for  Noise-Cleaning 


PROGRAM  HOUGHTRAN 

IMPLICIT  INTEGER*2  ( I » J*K.L*M*N) 

-THIS  SECTION  REQUESTED  THE  PARAMETERS  NEEDED  BY  THE  VARIOUS 
-ROUTINES  OF  THE  PROGRAM. 

CHARACTERS  IAN. NTY  * IAN2 
CHARACTERS  FNTUO 
CHARACTER*?  FNAME 

-THE  NEXT  2  LINES  ASK  IF  THE  USER  WISHES  TO  PROCESS  INTER¬ 
ACTIVELY  (ON  1  SUBIMAGE  AT  A  TIME)  OR  AUTOMATICALLY  (PROCESS 
-ENTIRE  IMAGE  WITHOUT  USER  INTERACTION). 

1  WRITE(6»*) 'PROCESSING  MODE  (0= INTERACTIVE*  1-AUTOMATIC) ' 
READ(5»*)  MP 

WRITE(6»*)  'QUANT. LEVELS  OF  RH0K256)*  THETA(<13>' 

READ ( 5  *  *  >  LR.LA 

WRITE(6f *)  '*  OF  R0WS(<256)  AND  COLS  (<256)  TO  PROCESS' 
READ(5**)  NX*  NY 

WRITE(6»*> 'WINDOW  SIZE  FOR  GLOBCON  TEST  (DEFAULT88?)  *  ' 

NW88? 

READ(5>«)  NW 
NW=(NW-l)/2 
IF(MP.EQ.l)  GO  TO  2 

-THE  FOLLOWING  ASKS  WHAT  OUTPUT  TO  PERFORM  DURING  ROUTINE 
WRITE(4**>  'DISPLAY  EDGE  FIELD?  (Y/N)' 

-IF  ' ¥•'  IS  ENTERED*  THE  RESULTS  OF  EDGE  DETECTION  IS  OUTPUT 
READ (5 *500)  NTY 
500  FORMAT (Al) 

WRITE(6»*>  'DISPLAY  HOUGH  TRANSFORM  MATRIX?  (Y/N)' 

-IF  'Y't  THE  TRANSFORM  MATRIX  IS  OUTPUT 
READ(5»500)  IAN2 

WRITE(4» *)  'DISPLAY  INVERTED  TRANSFORM?  (Y/N)' 

-IF'Y' *  INVERTED  TX  DISPLAYED  SUPERIMPOSED  ON  EDGE  FIELD 
READ(5f 500)  IAN 

-THE  FOLLOWING  REQUESTS  THRESHOLDS  FOR  VARIOUS  TESTS  INVOLVED 
-IN  EXTRACTING  LINES*  BRIDGE  CANDIDATES*  AND  BRIDGES 

2  WRITE(6»*> 'FOR  FOLLOWING  THRESHOLDS*  ENTER  0  FOR  DEFAULT' 
WRITE(6»*>  'EDGE  DETECTOR  THRESHOLD  (DEFAULT-5000.0) J ' 

READ (5**)  AX 

IF(AX.EQ.O)  AX-5000.0 

WRITE(6**>  'MATRIX  INVERSION  THRESHOLD  (DEFAULT-10);' 

READ(5*«)  IG 

IF(IG.EOvO)  IG-lO  - - - 

WRITE(6**> 'PARA. LINE  SEPAR.  THRESHOLD (DEFAULT-5  PIXELS) ' 
READ(5»*>  ST 
IF(ST.EQ.O)  ST-5.0 

WRITE(4**> 'MAX. LINE  PAIR  OFFSET  (X  OF  AVG  LNTH* DEFAULT-. 5) ' 
READ(5*«)  OT 
IF(OT.EQ.O)  0T-0.5 

WR ITE  ( 6  *  * ) '  MAX  LENGTH  DIFFERENCES  OF  AVG  LGNTH)  (DEFAULT-0.5) ; ' 
READ (5**)  DT 
IF(DT.EQ.O)  DT-0.5 

WRITE(6»*> 'UNIFORMITY  THRESHOLD  FOR  GLOBCON(DEFAULT-15.0) t ' 
READ(5**)  BT 
IF(BT.EQ.O)  BT-15.0 
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3  WRITE (6  f  #>  'INPUT  FILE  NAME  (5  DIGITS?  IEf  PIC05)' 

READ <5* 501 >  FNTWO 

501  FORMAT (A5) 

FNAME=FNTWQ// ' .DAT' 

WRITE (6f4)  'INPUT  RECORD  LENGTH' 

READ  ( 5  f  # )  LH 
IF(MP.EQ.l)  GO  TO  20 

WRITE(6f*) 'STARTING  COORDS  (UPPER-LEFT  CORNER f  ROWfCOL)?' 
READ(5r#>  IXfIY- 

-IMAGE  TO  BE  OPERATED  ON  IS  READ  INTO  MEMORY 
CALL  RIM(NXfNYfIXfIYf FNAME f  LHfBR) . 

-HOUGH  TRANSFORM  IS  PERFORMED 

CALL  HOUGH  ( LR  f  LA  f  I X  f  I Y  f  NX  f  NY  f  AX  f  NTY ) 

-MATRIX  IS  OPTIONALLY  DISPLAYED 
IF( IAN2.EG« 'N' >  GO  TO  4 
CALL  MATRIX (LRf LA) 

-THE  TRANSFORM  IS  INVERTED  (IEf  LINES  ARE  EXTRACTED) 

4  CALL  INVERT (LRfLA f IG  f IX f IY  fNX fNY  f IANfN) 

-CANDIDATES  ARE  EXTRACTED  AND  TESTED  USING  GLOBCON 

CALL  PARLINES(Nf IXf IYfSTfOTfDTfBT f FNAME fLHf 'Y' fOfNW) 

GO  TO  5 

-THE  FOLLOWING  SECTION  PROCESSES  AN  ENTIRE  IMAGE  WITHOUT 
-FURTHER  INTERACTION  BY  CALCULATING  EACH  SUCCEEDING  SET 
-OF  STARTING  COORD'S  AND  USING  THE  VALUES  ENTERED  ABOVE* 

20  WRITE (6*13) 

13  FORMAT ( 'O'* 'CTR(XfY)  ANGLE  LNGTH  WIDTH  *LINES  RESULT' ) 
DO  15  K=2f1024fNX 
NY2=NY 
NX 2= NX 

DO  15  J=2  *  LH-2  *  NY 

IF ( (LH-J) *LT *NY)  NY2«LH-J 

CALL  R I M ( NX2  f  NY2 f  K  f  J  f FNAME  f  LH  f  8R  > 

IF (NX2.EQ.0)  GO  TO  5 

CALL  HOUGH (LRf LA f IX* I Y fNX2fNY2f AXf ' N' ) 

CALL  INVERT (LRfLA fIGf IXf IYfNX2fNY2f 'N'fN) 

15  CALL  PARLINES(NfKf JfST fOTfDTfBT f FNAME fLHf 'Y'fIfNW) 

5  WRITE(6»*) 'NEXT  PROCEDURE?  ENTER  1-DIGIT  CODE' 

WRITE(6»«)  '  1)  PROCESS  ANOTHER  IMAGE-NO  CHANGES' 

WRITE(6f*>  '  2)  PROCESS  ANOTHER  IMAGE-NEW  THRESHOLDS' 

WRITE(6f*)  '  3)  PROCESS  ANOTHER  IMAGE-NEW  OPTIONS' 

WRITE(6r*)  '  4)  STOP' 

-IF  '1'f  ONLY  IMAGE  FILE/COORDS  ARE  REQUESTED.  IF'2'»  NEW 
-THRESHOLD  VALUES  ARE  REQUESTED  ALSO.  '3'  RESTARTS  PROGRAM. 
READ(5f*>  NPROC 
GO  TO  (3f2f1f99)f NPROC 
99  STOP 
END 

SUBROUTINE  HOUGH ( LR  f  LA  f IX  f I Y  f  NX  f  NY  f  AX  f  NTY  > 

IMPLICIT  INTEGER42  ( I f JfKfLfMfN) 

-THIS  ROUTINE  PERFORMS  A  'FAST'  HOUGH  TRANSFORM  (DUE  TO 
-O'GORMAN  t  CLOWES)  ON  THE  IMAGE  ARRAY  'R'.  FEATURE  POINTS 
-TO  TRANSFORM  ARE  GENERATED  BY  A  3X3  EDGE  DETECTER  MASK. 

-THE  TRANSFORM  IS  'FAST'  SINCE  ONLY  A  SINGLE  RHO/THETA  CELL 
-IS  INCREMENTED  PER  FEATURE  POINT— THAT  ONE  CORRESPONDING  TO 
-THE  ANGLE  RETURNED  BY  THE  EDGE  DETECTER. 
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COMMON  R ( 256 * 256  > /HUFF/HF ( 1 3 » 256 * 4  > » NTOT <  256  >  »  XX 
COMMON  /CH/IMG(256»256)/LT/A( 13x3) »PI 
INTEGER*2  HF  *  R  * RHO 
CHARACTER*!  IMG*NTY 
PI“ATAN( 1 «0)*4. 

-INITIALIZE  THE  RHO/THETA  MATRIX! 

DO  1  K*1*LR 
DO  1  J=1 *LA 
DO  1  L=l»4 
1  HF  ( J*K»L>=0 

IF(NTY.NE. 'Y' >  GO  TO  99 
WRITE(6»98>  IX*IY*AX 

98  FORMAT  ('O'*'  EDGE  FIELD*  COORDS'  *215*  '  THRESHOLD* ' » FB . 1 ) 
“XX*  CALCULATED  IN  THE  NEXT  STEP*  CONVERTS  RHO  (DISTANCE) 

-TO  'RHO' »  THE  BUCKET  NUMBER. 

99  XX*(LR-1. )/(NX+NY> 

-THE  NEXT  10  LINES  DEVELOP  A  'LOOK-UP'  TABLE  TO  FACILITATE 
-THE  CALCULATION  OF  ThE  RHO/THETA  BUCKET  FOR  EACH  POINT. 

DO  4  K*1 »LA 
A(K*1)»(K“1)*PI/LA 
A(K*2)~A(K* 1) 

A(K.3)-1 

-1F(A(K»2>  . LE .  ( PI/4 . ) , OR .  A ( K . 2 ) . GE ♦  (  . 75#PI  )  )  GO  TO  3 
A(K»2)=*(PI/2.)-A<K*2> 

IF(COS( A(Kf 1 ) ) «LT.O)  A(Kf3)=NX*XX+1 
GO  TO  4 

3  IF(COS( A(Kf 1 > ) .LT.O)  A(K* 3>=NY*XX+1 

4  CONTINUE 

-THE  REMAINDER  PERFORMS  THE  TRANSFORM.  FOR  EACH  PIXEL*  THE 
-EDGE  OPERATOR  'EDGOP'  IS  CALLED*  WHICH  RETURNS  AN  EDGE 
-INTENSITY  B2*  ANGLE  A2*  AND  A  THIRD  VALUE  'IS'  WHICH  IS 
-+1  IF  A2  IS  BETWEEN  0  t  PI*  AND  -1  IF  A2  IS  BETW.  PI  ft  2PI. 
-B2=0  IF  THE  PIXEL  IS  NOT  AN  EDGE  (IE*  FEATURE)  POINT.  THE 
-CORRESP.  L/RHO  BUCKET  IS  THEN  CALCULATED.  LINES  ARE  HANDLED 
-SLIGHTLY  DIFFERENT  DEPENDING  ON  ITS  ANGLE!  THOSE  BETWEEN 
-PI/4  AND  3PI/4  CALCULATE  RHO  AS  A  HORIZ.  AXIS  INTERCEPT. 
-THOSE  <  PI/4  OR  >3PI/4  USE  THE  VERT.  AXIS  INTERCEPT  AS  RHO. 
-LINES  WITH  ANGLES  >PI/2  (NEG.  SLOPES)  MUST  HAVE  EITHER  NX  OR 
-NY  ADDED  TO  RHO  IN  ORDER  THAT  RHO  'BUCKET  NUMBER'  NEVER  BE 
-NEGATIVE.  THIS  PROCEDURE  DIFFERS  FROM  THE  MORE  COMMON  USE 
-OF  RHO  AS  THE  PERPENDICULAR  DISTANCE  OF  ANY  LINE  TO  THE 
-ORIGIN*  BUT  HAS  THE  ADVANTAGE  OF  NOT  GIVING  SPURIOUS  'SPLIT' 
-LINES  DUE  TO  IMAGE  DIGITIZATION* 

DO  5  K-1*NX 
DO  5  J«1»NY 

5  IMG(K*J)«'  ' 

DO  15  K»1*NX 
DO  10  J«1»NY 

CALL  EDGOP (J+1*K+1*AX*B2*A2*IS) 

IF (B2.EQ.0)  GO  TO  10 
IMG(K*J)-'*' 

L«A2*LA/PI+1 
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-BESIDES  THE  USUAL  INCREMENTING »  EACH  RHO/THETA  CELL  KEEPS  2 
-OTHER  VALUES:  THE  AVERAGE  X  VALUE  <A2  >3PI/4  OR  <PI/4>  OR 
-Y  VALUE  <PI/2<A2<3PI/4)  FOR  THE  1ST  3  PTS  AND  FOR  ALL  PTS 
-ALONG  THAT  LINE* (THESE  ARE  USED  LATER  TO  ESTIMATE  ITS  CENTER 
-AND  LENGTH) »  THE  3RD  ADD.  VALUE  SUMS  THE  'IS'  VALUES r  THUS 
-INDICATING  IF  THE  LINE  IS  PRIMARILY  'POSITIVE' <0-PI>  OR 
-'NEGATIVE' <PI-2*PI) . 

RHO»(J*SIN<A<L» 1) )+K*CQS<A<L»l> ) >*XX/C0S(A(L»2) )+A(L»3> 
IV-J 

IF<HF<L»RH0»3) .GT. 31700)  GO  TO  10 
HF(Lf RHOf 1 >=HF (L  >RHO» 1 >  +  l 

IF<A<Lfl).GT.(PI/4.),AND.A<L»l) .LT.(.75*PI))  IV-K 
HF<LfRHOf3T=HF<L.RHO»3)+IV 

IF<HF <L r RHO» 1 ) «LE • 3 )  HF  <  L » RHO ? 2 ) *HF < L » RHO » 2 ) + 1 V 
HF  <  L  »  RHO  »  4 ) -HF  <  L  *  RHO » 4 )  + 1 S 
10  CONTINUE 

-DISPLAYS  THE  JUST-PAST  LINE  OF  EDGE  POINTS  IF  NTY-'Y'. 
IF<NTY.EQ» ' Y' )  WRITE<6»*)  < IMG<K» J) r J«1 »NY) 

IS  CONTINUE 
RETURN 
END 

SUBROUT I NE  EDGOP <J»KrAX.B2»A2»IS) 

IMPLICIT  INTEGER*2  < I » J»Kf LrMrN) 

-THIS  ROUTINE  IS  A  MODIFICATION  OF  THE  'SOBEL'  EDGE  OPERATOR. 
-IT  USES  THE  WEIGHT  SQRT(2>  (DUE  TO  FREI  l  CHEN)  RATHER  THAN 
-2-RESULTING  IN  UNIFORM  PROB  OF  EDGE  DETECTION  FOR  ANY  ANGLE. 
COMMON  R<256»254)/LT/A( 13*3) rPI 
INTEGER42  R 

E»R(K-lrJ“l)-R(K+lf J+l> 

D»R<K-lrJ+l)-R(K+l»J-l) 

F«E+D+1.414*(R(K-1» J)-R(K+1»J) ) 

G»E“D+1.414*(R(K» J“1)“R(K»J+1> ) 

B2=F**2+G**2 
IF(B2.LT. AX)  B2-0 
IS-1 

-'IS'  IS  SET  TO  -1  IF  THE  ANGLE  IS  BETWEEN  PI  AND  2*PI. 
IF(G.LT.O)  IS— 1 
IF(F.EO.O)  GO  TO  2 
A2-ATAN2(G»F> 

-THE  NEXT  LINE  SHIFTS  RANGE  FROM  <-PI/2»PI/2)  TO  <0»PI> 
IF<A2.LT .0)  A2-PI+A2 
GO  TO  3 

2  A2-(PI/2.) 

3  RETURN 
END 

SUBROUTINE  MATRIX(LRrLA) 

IMPLICIT  INTEGER*2  (I * J*K*L*M*N> 

-THIS  ROUTINE  DISPLAYS  THE  RHO/THETA  'INCREMENT'  CELLS. 

COMMON  /HUFF/HF ( 1 3 » 256 » 4 ) r NTOT  <  236 ) r XX 
CHARACTER*3  DASH 
INTEGER*2  HF 
WR1TE<6* 1 )  ( J* J-l *LA) 

1  FORMAT < '  '  »8Xr 2013). 

DASH-' - ' 


WRITE(6r2)  (DASH*J«1»LA> 

2  FORMAT <'  '  f 9Xf20A3) 

DO  5  K-IfLR 

5  WRITEC6»6)  K»<HF(LfKfl)»L*lfLA) 

6  FORMAT  <  '  '»I4»4X»20I3> 

RETURN 

END 

SUBROUT INE  RIM(NX»NY»IX»IY» FNAME t LH » BR > 

IMPLICIT  INTEGER*2  ( I » J»K»L»M»N> 

-THIS  ROUTINE  READS  IN  AN  IMAGE  OF  LENGTH  LHf  LOCATED  IN 
-FILE  IN*  'NX'  ROWS  BY  'NY'  COLUMNS  (STARTING  WITH  UPPER 
-LEFT  PIXEL  < IX» IY)  ARE  STORED  IN  THE  INTEGER*2  ARRAY  'R'. 
COMMON  R  <  256 » 256 ) 

INTEGER*2  R»BR(LH> 

CHARACTER*9  FNAME 

OPEN ( ACCESS3 ' DIRECT ' t NAME-FNAME t TYPE® ' OLD ' »  UNIT-21) 

DO  5  K-IX-1 t IX+NX 
READ (21 'K»END=10)  BR 
DO  5  J=IY-1 r IY+NY 
5  R(K-IX+2* J-IY+2 )=BR( J> 

GO  TO  20 
10  NX=K-IX+1 
20  CLOSE (UNIT-21) 

RETURN 
-  END 

SUBROUTINE  INVERT<LR»LA»IG»IX2fIY2»NXfNY»IAN»N> 

IMPLICIT  INTEGER*2  (I» J»K»L»MfN) 

-THIS  ROUTINE  INVERTS  THE  TRANSFORM  PERFORMED  BY  'HOUGH'.  IT 
-EXTRACTS  FROM  THE  RHO/THETA  MATRIX  THOSE  CELLS  EXCEEDING  THE 
-THE  THRESHOLD  'IG'.  FOR  EACH  EXTRACTED  CELL  IT  CALCULATES 
-THE  CENTER  POINT  (XSY)r  HALF-LENGTH »  SLOPE  *  INTERCEPT  t  THEN 
-STORES  THESE  IN  THE  ARRAY  'IL'*  THE  LINE  IS  THEN  BACK- 
-PROJECTED  ONTO  THE  EDGE  FIELD  'IMG'  FOR  OPTIONAL  DISPLAY* 
-THE  COORDINATE  SYSTEM  USES  THE  UPPER-LEFT  CORNER  AS  THE  ORI- 
-GIN*  BUT  IS  ROTATED  90  DEGREES  FOR  LINES  WITH  ANGLES  BETWEEN 
-PI/4  AND  3PI/4. 

COMMON  /HUFF/HF ( 13 » 256 » 4 ) r NTOT ( 256 ) r XX/CH/IMG ( 256  * 256 ) 
COMMON  /LZNE/IL(128r6) »SLP(128) fCPT( 128)/LT/A< 13f 3) rPI 
INTEGER*2  HF 
CHARACTER*!  IMG » IAN 
N-0 

IF ( IAN.NE* ' Y' )  GO  TO  99 
URITE<6'98>  IX2f IY2r IG 

98  F0RMAT('0'» 'INVERTED  IMAGE »  COORDS' r 215 t '  THRESH.-' » 14) 

99  DO  20  K-l »LA 
DO  20  J-l fLR 

IF(HF(KfJ»1).LT*IG)  GO  TO  20 
N-N+l 

IL(N*1)»HF(K*Jf1) 

IL(N»2)“(K-1 >*180./LA 
IL(N»3)-HF(K» Jf3)/IL(N» 1 > 

IL(N»5)*ABS( IL(N»3)-HF (K»J»2)/3)+l 
IL(N» 6)-HF (K  r J»4 ) 

CPT  <  N ) ■ <  J-A ( K » 3 ) ) /XX 
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SLP<N)«-TAN<A<K»2) ) 

IL<N»  4 )=SLP  <  N ) #IL (N»  3)+CPT  <N) 

IF< IAN.EQ. 'N' )  GO  TO  20 

DO  15  M=IL<N»3)-IL<N*5)»IL<N»3)+IL(N»5)  v 

IY»SLP(N)*M+CPT<N) 

IF(M.LT.l.OR.IY.LT.l)  GO  TO  15 

IF<A<K»l).GT.<PI/4.) » AND • A ( K* 1 ) . LT «  < . 75*PI >  >  GO  TO  13 
IF<M.LE.NY. AND . I Y « LE • NX )  IMG<  IY»M)-'X' 

GO  TO  15 

13  IF<M«LE.NX. AND. IY.LE .NY)  IMG<M» IY)-'X' 

15  CONTINUE 
20  CONTINUE 

IF < IAN.EQ. 'N'  )  GO  TO  30 
DO  18  K*1 f NX 

18  WRITE<6,*)  ( ING(K » J) f J=1 »NY) 

30  RETURN 
END 

SUBROUTINE  PARLINES  <  N  t IX  *  I Y  t ST  *  OT  r  DT  f  BT » FNAME »  LH  » I AN  »  MP  t NW ) 
IMPLICIT  INTEGER*2  < I » J»K»L»M»N) 

COMMON  /LINE/IL ( 128.6)  »SLP  < 128) »CPT< 128>/BC/IC< 128.8) » ID< 128) 
CHARACTER* 10  BRD 
CHARACTER*1  IAN 

CHARACTER*9  FNAME  *  . 

THIS  ROUTINE  EXTRACTS  FROM  THE  SET  OF  LINES  GENERATED  BY 
'INVERT'  THOSE  SETS  OF  LINES  THAT  MAY  BE  BRIDGES.  THESE  ARE 
SETS  OF  LONGf  PARALLEL  LINES  OF  APPROX.  EQUAL  LENGTH  AND 
LOCATION.  IF  IAN«'Y'»  THE  SURVIVING  CANDIDATES  ARE  THEN 
TESTED  BY  'GLOBCON'  TO  BE  OVER  'WATER'  AND  THUS  A  BRIDGE. 
IF<MP«EQ» 1 )  GO  TO  99 
WRITE(6. 1 ) 

1  FORMAT <  '0'  »  '  CTR <  X  »  Y  )  ANGLE  LNGTH  WIDTH  OLINES  RESULT') 

99  DO  2  K»lfN 

2  ID<K)»0 
NP-0 

BRD*'  ' 

DO  20  K-l.N-1 
-  IF<  ID<fO  .EG*  1 )  GO  TO  20 
IE»1 

DO  9  J-K+l.N 

IF(IL<K»2).NE.IL<J'2).OR.ID(J).NE.O)  GO  TO  9 
A«ATAN<SLP<K) ) 

D*  <  COS  <  A ) #  <  CPT  <  K ) -CPT  <  J ) ) ) **2 
AVG*IL<K.5)+IL<  J» 5) 

X*IL(K»3)-IL<Jr3> 

Y-IL<K»4)-IL<J»4> 

0F«X**2  +  Y**2  -  D 

IF<D.GT.<ST**2).OR.OF.GT,(<OT*AVG>**2)/  GO  TO  9 

IF(ABS( CIL<K»5)-IL(Jf5))*2.).GT. (DTftAVG) )  GO  TO  9 

ID(K>*2 

ID< J)*l 

IE-IE+1 

IF(IE.GT .2)  GO  TO  5 
NP-NP+1 

IF <  XLCK.2) .GT *45. AND. XL (K. 2) »LT« 135)  GO  TO  3 


A- 9 


IC(NPfl)®IL<K»3) 

IC<NP*2)=IL<K»4> 

GO  TO  4 

3  IC<NP»1)=IL(K»4> 

IC<NP*2>=IL<K*3> 

4  IC<NP.3)®IL<K*2> 

IC(NP»4)®IL(K»5> 

IC(NP*7)=IL(K»1) 

5  IF<IL(Nf2).GT.45« AND* IL<K?2).LT*135)  GO  TO  6 
IC(NP*1>=IC<NP»1)+IL<J»3> 

IC(NP*2)=IC(NP*2)+IL<J»4> 

GO  TO  7 

6  IC(NP»1)=IC<NP»1)+IL(J»4) 

IC(NPf2)=IC(NP»2)+IL(J»3> 

7  IC(NP»4)=IC<NP»4)+IL<J»5) 

IC<NPt  7>SSIC(NP»7)+IL(  J»  1 ) 

•  D=ABS<D> 

8  IC(NP»5>*SQRT(D> 

9  CONTINUE 

IF< ID  <  K ) « EQ .0)  GO  TO  20 
IC<NP»1)=IC<NP»1)/IE  +IY-1 
IC<NP»2)=IC(NP*2)/IE  +IX-1 
IC<NP»4>=IC(NP»4>/IE  *  2 
IC(NP»6)*IE 
IC(NP»7)=IC<NP»7>/IE 
DO  13  J=1 »NP-1 

IF ( <  ABS  < IC  <  NP» l)-IC(Jfl)) +ABS< IC( NP»2)-IC( J»2) ) ) »GT»5) 
*  GO  TO  13 
L«0 

IF< ICCNP* 6) »GT . IC( J»6) )  L-L+l 
IF<IC(NPf7>.GT.IC(J*7>>  L*L+1 
IF( IC(NP»4) . GT.IC(J»4)>  L»L+1 
IF (L.GE.2)  GO  TO  12 
NP«NP-1 
GO  TO  20 

12  IC<J'6)»0 

13  CONTINUE 
20  CONTINUE 

DO  25  K-1'NP 

IF(IC(Kr&) . EQ.O)  GO  TO  25 

IF < IAN.EQ. 'Y'  >  CALL  GLOBCON <  BT  »  LH  r FNAHE » NW » K » BRD ) 

WRITE ( 6  »24 )  <IC(K»L)»L=1»6) »BRD 

24  FORMAT < '  ' 1 14 r ' t ' » 14 *416, 2X» A10) 

25  CONTINUE 
RETURN 
END 

SUBROUT I NE  GLOBCON  <  BT * LH  * FNAME * NW  * NP  * BRD  > 

IMPLICIT  INTEGER*2  ( I  * J*K*L*M*N) 

-THIS  ROUTINE  PERFORMS  A  GLOBAL  CONTRAST  TEST  ABOUT  A 
-BRIDGE  'CANDIDATE'.  IT  COMPARES  THE  UNIFORMITY  WITHIN 
-A  NW  X  NW  WINDOW  AT  PTS  ON  EITHER  SIDE  OF  THE  'BRIDGE'.  IF 
-THE  UNIFORMITY  OF  BOTH  SIDES  ARE  BELOW  A  THRESHOLD (BT) 

-AND  THE  MEAN  BRIGHTNESS  ON  BOTH  SIDES  ARE  APPROX.  EQUAL * 
-THE  CANDIDATE  IS  JUDGED  TO  BE  OVER  WATER*  THUS  A  'BRIDGE' 
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COMMON  /BC/IC ( 128 »8) f ID< 128) 

CHARACTER*10  BRD 
CHARACTER*?  FNAME 
BRD='BRIDGE  ' 

PI*ATAN(1.0>*4. 

ANG=IC(NPf 3)*PI/180. 

DIST=(NW*2+2+IC(NPf 5)/2. ) 

IXL*IC(NP f 1 )-DIST*SIN ( ANG) 

IYL=IC(NPf2)-DIST*C0S(ANG) 

IXR=IC<NP » 1 >  +DIST*SIN( ANG) 

IYR*IC(NPf 2)+DIST*C0S( ANG) 

CALL  TEX<LHf IR f FNAMEf IXLr IYL? El f  VI f NWf BRD) 

CALL  TEX < LH . I R » FNAME » IXR » I YR » E2 » V2 r NW » BRD ) 

IF(BRD.EQ. ' INCQNCLUS. ' )  GO  TO  20 
DM*ABS( (El-E2)/2* ) 

IF(V1.GT.BT.0R.V2.GT.BT.0R.DM.GT.BT>  BRD*'NOT  BRIDGE' 

20  RETURN 
END 

SUBROUTINE  TEX (LH t IR » FNAME flfKfEfVfNWf BRD > 

IMPLICIT  INTEGER*2  ( I f Jf KrLf MrN) 

CHARACTER* 10  BRD 
CHARACTER*?  FNAME 
DIMENSION  IR(LH) 

-THIS  ROUTINE  CALCULATES  THE  MEAN  BRIGHTNESS  AND  UNIFORMITY 
-< IEf  STD. DEV. OR. BRIGHTNESS)  WITHIN  A  NWXNW  WINDOW  CENTERED 
-ON  If K.  THIS  'TEXTURE'  MEASURE  IS  USED  IN  THE  GLOBCON  TEST. 
E*0 
V-0 

OPEN (ACCESS* 'DIRECT' fNAME*FNAMEf TYPE* 'OLD' fUNIT*21) 

-THE  NEXT  LINE  SEES  IF  THE  POINT  IS  TOO  CLOSE  TO  THE  EDGE. 
IF(K.LE.NW. OR . I . LE . NW « OR • I • GT . ( LH-NW ) )  GO  TO  10 
WS*(2*NW+1)**2 
DO  5  L*K-NWf K+NW 
READ (21 'LfEND*10>  IR 
DO  5  M*I-NWf I+NW 
E*E+IR(M) 

5  V*V+FLOAT(IR(M) )**2 
E»£/WS 

V*SQRT( V/WS  -E**2) 

GO  TO  20 

10  BRD* ' INCONCLUS . ' 

20  CLOSE (UN IT-21) 

RETURN 

END 
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To  cope  with  the  expanding  technology,  our  society  must 
be  assured  of  a  continuing  supply  of  rigorously  trained 
and  educated  engineers.  The  School  of  Engineering  and 
Applied  Science  is  completely  committed  to  this  ob¬ 
jective. 


